
PRACTICAL MACHINE LEARNING WITH TENSORFLOW
TensorFlow

• TensorFlow is a popular framework for machine learning and deep learning. It is a free and open-source library, first released on 9 November 2015 and developed by the Google Brain team. Its primary interface is the Python programming language, and it is used for numerical computation and dataflow programming, which makes machine learning faster and easier.
• The word TensorFlow is a combination of two words, i.e., Tensor and Flow:

• Tensor is a multidimensional array


• Flow is used to define the flow of data in operation.
• TensorFlow is used to define the flow of data in
operation on a multidimensional array or Tensor.
• It is mainly used for deep learning or machine
learning problems such as
• Classification,
• Perception,
• Understanding,
• Discovering, Prediction, and Creation.
Features of Tensorflow
1. Responsive Construct

• We can visualize each part of the computation graph, which is not an option while using NumPy or scikit-learn. To create a deep learning application, only two or three components are required, along with a programming language.
• 2. Flexible
• Flexibility is one of the essential TensorFlow features, in terms of its operability. TensorFlow is modular, and the parts of it that we want can be used standalone.

• 3. Easily Trainable
• It is easily trainable on CPU as well as on GPU for distributed computing.
4. Parallel Neural Network Training
• TensorFlow offers pipelining, in the sense that we can train multiple neural networks on multiple GPUs, which makes the models very efficient on large-scale systems.
5. Large Community
• Google has developed it, and there already is a large
team of software engineers who work on stability
improvements continuously.

6. Open Source
• The best thing about this machine learning library is that it is open source, so anyone with internet connectivity can use it.
7. Feature Columns -
• TensorFlow has feature columns which could be
thought of as intermediates between raw data and
estimators; accordingly, bridging input data with our
model.
8. Availability of Statistical Distributions
• The library provides distribution functions including Bernoulli, Beta, Chi2, Uniform, and Gamma, which are essential especially when considering probabilistic approaches such as Bayesian models.
• 9. Layered Components
• TensorFlow produces layered operations with weights and biases through functions such as tf.contrib.layers, which also provides batch normalization, convolution layers, and dropout layers.
• tf.contrib.layers.optimizers includes optimizers such as Adagrad, SGD, and Momentum, which are often used to solve optimization problems in numerical analysis.
• 10. Visualizer (With TensorBoard)
• We can inspect a different representation of a model and make the changes necessary while debugging it with the help of TensorBoard.

• 11. Event Logger (With TensorBoard)

• It is just like UNIX, where we use tail -f to monitor the output of tasks at the command line. It checks and logs events and summaries from the graph during training, and displays them with TensorBoard.
FEATURE VECTOR
USE CASE 1

• Let's say you're teaching a robot how to fold clothes.


The perception system sees a shirt lying on a table
(figure). Decide which features would be most useful
to track.
USE CASE 2

• Now, instead of detecting clothes, you


ambitiously decide to detect arbitrary objects.
What are some salient features that can easily
differentiate objects?

Figure: Here are images of three objects: a lamp, a pair of pants, and a dog.
What are some good features that you should record to compare and
differentiate objects?
Figure 1.9 Feature vectors are a representation of real world data used by both the
learning and inference components of machine learning. The input to the
algorithm is not the real-world image directly, but instead its feature vector.
Representation of a Tensor

• An ordered list of some features is called a feature


vector, and that’s exactly what we’ll represent in
TensorFlow code.
• In TensorFlow, a tensor is a collection of feature vectors (i.e., an array) of n dimensions.
• A matrix concisely represents a list of vectors,
where each column of a matrix is a feature vector.
• The syntax to represent matrices in TensorFlow is a
vector of vectors, each of the same length.
Tensors

• All computations pass through one or more Tensors


in TensorFlow. A tensor is an object which has three
properties which are as follows:

• A unique label (name)


• A dimension (shape)
• A data type (dtype)
There are four main tensors we can create:

• tf.Variable
• tf.constant - Creates a constant tensor from a tensor-
like object.
• tf.placeholder
• tf.SparseTensor
Creating tensors
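• The code from the original slide is not reproduced here; the following is a minimal sketch that creates three constant tensors (the values and names m1, m2, m3 are illustrative) and prints them in graph mode, producing output like that described below.

```python
import tensorflow as tf
tf.compat.v1.disable_eager_execution()  # graph mode, as used elsewhere in these notes

# Three constant tensors of different ranks (illustrative values)
m1 = tf.constant([[1., 2.]])           # 1x2 matrix of floats
m2 = tf.constant([[1], [2]])           # 2x1 matrix of integers
m3 = tf.constant([[[1, 2], [3, 4]]])   # rank-3 tensor

print(m1)  # Tensor("Const:0", shape=(1, 2), dtype=float32)
print(m2)  # Tensor("Const_1:0", shape=(2, 1), dtype=int32)
print(m3)  # Tensor("Const_2:0", shape=(1, 2, 2), dtype=int32)
```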
OUTPUT
• As you can see from the output, each tensor is represented by
the aptly named Tensor object.
• Each Tensor object has a unique label (name), a dimension
(shape) to define its structure, and data type (dtype) to specify
the kind of values we will manipulate. Because we did not
explicitly provide a name, the library automatically generated
the names: “Const:0”, “Const_1:0”, and “Const_2:0”.
Creating Operators

• Below is a list of commonly used operations. The idea is the same: each operation requires one or more arguments.
• tf.exp(a)
• tf.sqrt(a)
• tf.add(a,b)
• tf.subtract(a,b)
• tf.multiply(a,b)
• tf.div(a,b)
• tf.pow(a,b)
• tf.add(x, y)  Add two tensors of the same type, x + y
• tf.subtract(x, y)  Subtract tensors of the same type, x - y
• tf.multiply(x, y)  Multiply two tensors element-wise
• tf.pow(x, y)  Take the element-wise power of x to y
• tf.exp(x)  Equivalent to pow(e, x), where e is Euler’s number (2.718…)
• tf.sqrt(x)  Equivalent to pow(x, 0.5)
• tf.div(x, y)  Take the element-wise division of x and y
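• As a quick illustration, here is a minimal sketch assuming TensorFlow 2.x, where eager execution evaluates the operations immediately (each print shows a tf.Tensor holding the value noted in the comment; the input values are arbitrary):

```python
import tensorflow as tf

x = tf.constant(8.0)
y = tf.constant(2.0)

print(tf.add(x, y))       # 10.0
print(tf.subtract(x, y))  # 6.0
print(tf.multiply(x, y))  # 16.0
print(tf.pow(x, y))       # 64.0
print(tf.sqrt(x))         # ~2.8284
```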
Types of data
• The second property of a tensor is its data type.
• A tensor can hold only one type of data at a time.
• We can return the type with the property dtype.
Tensor data types
• In addition to dimensionality, tensors have a fixed data type. You can assign a data type such as tf.float32, tf.float64, tf.int32, tf.int64, tf.string, or tf.bool to a tensor:
• tensor_a=tf.constant([[3,4]], dtype=tf.int32)
• tensor_b=tf.constant([[1,2]], dtype=tf.int32)
• tensor_add=tf.add(tensor_a, tensor_b)
• print(tensor_add)
Graph

• The graph is essential in TensorFlow. All the mathematical operations (ops) are performed inside the graph.
• We can imagine a graph as a project in which every operation is a step to be completed.
• The nodes represent these operations, and they can create or consume tensors.
• The graph is made up of nodes and edges.
• A node is the representation of an operation, i.e., a unit of computation.
• An edge is a tensor; it can carry a new tensor produced by an operation or the input data consumed by one.
• The structure of the graph depends on the dependencies between individual operations.
• The dataflow graph was developed to view the data dependencies between individual operations. A mathematical formula or algorithm is made up of a number of successive operations.
• A graph is a beneficial way to visualize how these computations are coordinated.
• The structure of the graph connects the operations (i.e., the nodes) and shows how those operations are fed. Note that the graph does not display the output of the operations; it only helps to visualize the connections between individual processes.
• Example:
• Imagine we want to evaluate the given
function:
• d=b+c
• e=c+2
• a=d*e
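• A minimal sketch of this example (the values of b and c are assumed for illustration):

```python
import tensorflow as tf

b = tf.constant(3.0)    # assumed value
c = tf.constant(4.0)    # assumed value

d = tf.add(b, c)        # d = b + c -> 7.0
e = tf.add(c, 2.0)      # e = c + 2 -> 6.0
a = tf.multiply(d, e)   # a = d * e -> 42.0

print(a)  # tf.Tensor(42.0, shape=(), dtype=float32) in eager mode
```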
SESSION
• To compute the values of the tensors in a graph, we need to open a session.
• Inside a session, we must run an operator to create an output.
• A Session object encapsulates the environment in which Operation objects are executed and Tensor objects are evaluated.
• Session
• A session can execute the operation from the graph. To feed
the graph with the value of a tensor, we need to open a session.
Inside a session, we must run an operator to create an output.
tf.compat.v1.disable_eager_execution() # need to disable eager in
TF2.x
# Build a graph.
a = tf.constant(5.0)
b = tf.constant(6.0)
c=a*b

# Launch the graph in a session.


sess = tf.compat.v1.Session()

# Evaluate the tensor `c`.


print(sess.run(c)) # prints 30.0
• In the example below, we will:

• Create two tensors


• Create an operation
• Open a session
• Print the result
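• The code for this example is not shown in the original slides; the following is a minimal sketch consistent with the explanation that follows (the constant values are arbitrary):

```python
import tensorflow as tf
tf.compat.v1.disable_eager_execution()

# Step 1: create two tensors
x = tf.constant([4], dtype=tf.int32)
y = tf.constant([5], dtype=tf.int32)

# Step 2: create an operation
multiply = tf.multiply(x, y)

# Step 3: open a session
sess = tf.compat.v1.Session()

# Step 4: run the operation and print the result
result_1 = sess.run(multiply)
print(result_1)  # [20]

# Close the session
sess.close()
```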
• Explanation of Code
• tf.Session(): Opens a session. All the operations will flow through the session.
• run(multiply): Executes the operation which was created in step 2.
• print(result_1): Finally, we can print the result.
• close(): Closes the session.
TensorFlow Playground

• https://www.javatpoint.com/tensorflow-playground

• Visualizing summarized information is a vital part of


any data scientist's toolbox.
• TensorBoard is a software utility that allows the
graphical representation of the data flow graph and a
dashboard used for the interpretation of results,
normally coming from the logging utilities:
• TensorFlow Playground is a web app that allows
users to test the artificial intelligence (AI)
algorithm with TensorFlow machine learning
library.

• TensorFlow Playground lets users who are unfamiliar with high-level maths and coding experiment with neural networks for deep learning and other machine learning applications. Neural network operations are interactive and represented visually in the Playground.
Tensorflow 2.0 Architecture

• https://blog.tensorflow.org/2019/01/whats-coming-in-tensorflow-2-0.html
TensorFlow 2.0 Architecture
TensorFlow 2.0 focuses on simplicity and ease of use, featuring updates like:
• Easy model building with Keras and eager execution.
• Robust model deployment in production on any
platform.
• Powerful experimentation for research.
• Simplifying the API by cleaning up deprecated APIs
and reducing duplication.
TensorFlow 2.0 Architecture
Load your data using tf.data -

• Training data is read using input pipelines which are


created using tf.data. Feature characteristics, for
example bucketing and feature crosses are described
using tf.feature_column. Convenient input from in-
memory data (for example, NumPy) is also supported.
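• As a small sketch of such an input pipeline (the NumPy arrays and pipeline parameters are illustrative assumptions):

```python
import numpy as np
import tensorflow as tf

# Assumed in-memory data: 100 samples with 4 features each, binary labels
features = np.random.rand(100, 4).astype("float32")
labels = np.random.randint(0, 2, size=(100,))

# Build an input pipeline with tf.data: shuffle, batch and prefetch
dataset = (tf.data.Dataset.from_tensor_slices((features, labels))
           .shuffle(buffer_size=100)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))

for batch_features, batch_labels in dataset.take(1):
    print(batch_features.shape, batch_labels.shape)  # (32, 4) (32,)
```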
• Build, train and validate your model with tf.keras,
or use Premade Estimators.
• Keras integrates tightly with the rest of TensorFlow
so you can access TensorFlow’s features whenever
you want. A set of standard packaged models (for
example, linear or logistic regression, gradient
boosted trees, random forests) are also available to
use directly (implemented using the tf.estimator API).
If you’re not looking to train a model from scratch,
you’ll soon be able to use transfer learning to train a
Keras or Estimator model using modules from
TensorFlow Hub.
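• A hedged sketch of transfer learning with a TensorFlow Hub module inside a Keras model (the module URL, input size, and the 5-class head are illustrative assumptions, and the tensorflow_hub package must be installed):

```python
import tensorflow as tf
import tensorflow_hub as hub

# Illustrative image feature-vector module from TensorFlow Hub
module_url = "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4"

model = tf.keras.Sequential([
    hub.KerasLayer(module_url, trainable=False,      # frozen pre-trained features
                   input_shape=(224, 224, 3)),
    tf.keras.layers.Dense(5, activation="softmax"),  # new classification head
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```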
• Run and debug with eager execution, then use
tf.function for the benefits of graphs. TensorFlow 2.0
runs with eager execution by default for ease of use
and smooth debugging. Additionally, the tf.function
annotation transparently translates your Python
programs into TensorFlow graphs. This process
retains all the advantages of 1.x TensorFlow graph-
based execution: Performance optimizations, remote
execution and the ability to serialize, export and
deploy easily, while adding the flexibility and ease of
use of expressing programs in simple Python.
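• A minimal sketch of eager execution plus the tf.function annotation (the tensors and the small dense computation are illustrative):

```python
import tensorflow as tf

# Eager execution (the TF 2.x default) runs ops immediately
x = tf.constant([[2.0, 3.0]])
print(tf.square(x))  # tf.Tensor([[4. 9.]], shape=(1, 2), dtype=float32)

# tf.function traces the Python function into a TensorFlow graph
@tf.function
def dense_step(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

w = tf.ones((2, 2))
b = tf.zeros((2,))
print(dense_step(x, w, b))  # runs as a graph after the first trace
```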
• Use Distribution Strategies for distributed
training. For large ML training tasks, the Distribution
Strategy API makes it easy to distribute and train models on
different hardware configurations without changing the model
definition. Since TensorFlow provides support for a range of
hardware accelerators like CPUs, GPUs, and TPUs, you can
enable training workloads to be distributed to single-
node/multi-accelerator as well as multi-node/multi-accelerator
configurations, including TPU Pods. In addition to supporting a variety of cluster configurations, templates to deploy training on Kubernetes clusters in on-prem or cloud environments are provided.
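• A minimal sketch of the Distribution Strategy API (the model and layer sizes are illustrative; MirroredStrategy replicates the model across the GPUs visible on a single machine):

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# The model definition is unchanged; it is simply built inside the strategy scope
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# model.fit(...) is then called exactly as in the single-device case
```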
Export to SavedModel -
• TensorFlow will standardize on SavedModel as an
interchange format for TensorFlow Serving,
TensorFlow Lite, TensorFlow.js, TensorFlow Hub,
and more.
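• A minimal sketch of exporting and reloading a SavedModel (the model and the directory path are illustrative):

```python
import tensorflow as tf

# A small illustrative Keras model
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# Export to the SavedModel format
tf.saved_model.save(model, "export/my_model")

# The same directory can later be consumed by TensorFlow Serving,
# TensorFlow Lite converters, TensorFlow.js converters, and so on
reloaded = tf.saved_model.load("export/my_model")
```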
Keras and its architecture
• Input Layer -> Hidden Layers -> Output Layer
• Input Layer: This is the first layer in the model and serves as the entry point for the
input data. It specifies the shape of the input data (e.g., the number of features or
dimensions).

• Hidden Layers: These are the intermediate layers between the input and output
layers. Each hidden layer consists of one or more neurons (nodes) and applies
various operations on the data. Common types of hidden layers include Dense (fully
connected), Convolutional, Recurrent, etc., depending on the type of model being
built.

• Output Layer: This is the final layer of the model and produces the output predictions.
The number of neurons in this layer corresponds to the number of output classes or
the desired output dimensions.
• The arrows in the diagram represent the
flow of information from one layer to
another, with each layer's output serving
as the input to the next layer
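• A minimal Keras sketch of this input -> hidden -> output structure (the feature count, layer widths, and the 3-class output are illustrative assumptions):

```python
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(8,)),                      # input layer: 8 features
    keras.layers.Dense(16, activation="relu"),    # hidden layer 1
    keras.layers.Dense(8, activation="relu"),     # hidden layer 2
    keras.layers.Dense(3, activation="softmax"),  # output layer: 3 classes
])
model.summary()
```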
KERAS APPLICATION
• The Keras applications module is used to provide pre-trained models for deep neural networks. Keras models are used for prediction, feature extraction and fine-tuning. This section explains Keras applications in detail.

• Pre-trained models
• A trained model consists of two parts: the model architecture and the model weights. Model weights are large files, so they are downloaded separately and contain features learned from the ImageNet database. Some of the popular pre-trained models are listed below:

• ResNet
• VGG16
• MobileNet
• InceptionResNetV2
• InceptionV3
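• A hedged sketch of using one of these pre-trained models (VGG16) for prediction; the image file name is illustrative, and the ImageNet weights are downloaded on first use:

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

# Load VGG16 with its architecture and pre-trained ImageNet weights
model = VGG16(weights="imagenet")

# Prepare an input image (file name is illustrative)
img = image.load_img("elephant.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

# Predict and decode the top-3 ImageNet classes
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])
```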
UNIT - 2
EXPLORING THE DATA WITH
TENSORFLOW FRAMEWORK
What is a Binary Classifier in Machine Learning?

• A binary classifier in machine learning is a type of model that


is trained to classify data into one of two possible categories,
typically represented as binary labels such as 0 or 1, true or
false, or positive or negative.

• For example, a binary classifier may be trained to distinguish


between spam and non-spam emails, or to predict whether a
credit card transaction is fraudulent or legitimate.
• Binary classifiers are a fundamental building block of many machine learning applications, and there are numerous algorithms that can be used to build them, including logistic regression, support vector machines (SVMs), decision trees, random forests, and neural networks.
• These models are typically trained using labeled data, where
the correct label or category for each example in the training
set is known, and then used to predict the category of new,
unseen examples.
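• As an illustrative sketch (not from the original slides), a logistic-regression-style binary classifier in tf.keras trained on assumed random data:

```python
import numpy as np
import tensorflow as tf

# Assumed labeled data: 200 examples, 10 features, binary labels 0 or 1
X = np.random.rand(200, 10).astype("float32")
y = np.random.randint(0, 2, size=(200,))

# A single sigmoid output neuron gives the probability of class 1
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation="sigmoid", input_shape=(10,)),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)

# Probabilities above 0.5 are mapped to class 1, otherwise class 0
probs = model.predict(X[:5])
print((probs > 0.5).astype(int).ravel())
```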
What is Artificial Neuron

• An artificial neuron is a mathematical function based on a


model of biological neurons, where each neuron takes inputs,
weighs them separately, sums them up and passes this sum
through a nonlinear function to produce output.
Single Layer Perceptron in TensorFlow

• The perceptron is a single processing unit of any neural network. It was first proposed by Frank Rosenblatt in 1958 as a simple neuron which is used to classify its input into one of two categories.
• Perceptron is a linear classifier, and is used in
supervised learning.
• It helps to organize the given input data.
• A perceptron is a neural network unit that does a precise
computation to detect features in the input data.
• Perceptron is mainly used to classify the data into two parts.
Therefore, it is also known as Linear Binary Classifier.
• A Perceptron is an algorithm for supervised learning of binary classifiers. This algorithm enables neurons to learn and process elements in the training set one at a time.
• A machine-based algorithm used for supervised learning of various binary classification tasks is called a Perceptron.
A regular neural network looks like this:
• The perceptron consists of 5 parts.
1. Input value or One input layer:
2. Weights and Bias:
3. Net sum:
4. Activation Function:
5. Output
1. Input Layer: The input layer consists of one or more input
neurons, which receive input signals from the external world or
from other layers of the neural network.

2. Weights: Each input neuron is associated with a weight, which


represents the strength of the connection between the input
neuron and the output neuron.

3. Bias: A bias term is added to the input layer to provide the


perceptron with additional flexibility in modeling complex
patterns in the input data.
4. Activation Function: The activation function determines the
output of the perceptron based on the weighted sum of the inputs
and the bias term. Common activation functions used in
perceptrons include the step function, sigmoid function, and
ReLU function.

5. Output: The output of the perceptron is a single binary value,


either 0 or 1, which indicates the class or category to which the
input data belongs.
6. Training Algorithm: The perceptron is typically trained using
a supervised learning algorithm such as the perceptron learning
algorithm or backpropagation. During training, the weights and
biases of the perceptron are adjusted to minimize the error
between the predicted output and the true output for a given set of
training examples.

Overall, the perceptron is a simple yet powerful algorithm that


can be used to perform binary classification tasks and has paved
the way for more complex neural networks used in deep learning
today.
A standard neural network looks like the below
diagram.
How does it work?
The perceptron works in these simple steps:

Step 1: All the inputs x are multiplied by their corresponding weights w.

Step 2: Add all the multiplied values to calculate the weighted sum:

∑wi*xi = x1*w1 + x2*w2 + x3*w3 + …… + xn*wn

Add a term called bias 'b' to this weighted sum to improve the model's performance.

Step 3: An activation function is applied to the above-mentioned weighted sum, giving us an output either in binary form or as a continuous value, as follows:

Y = f(∑wi*xi + b)
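To make the steps concrete, here is a minimal NumPy sketch of a single perceptron; the weights and bias are assumed values chosen so that it computes the logical AND of two binary inputs:

```python
import numpy as np

def perceptron(x, w, b):
    # Steps 1-2: weighted sum of the inputs plus the bias term
    weighted_sum = np.dot(w, x) + b
    # Step 3: step activation function -> binary output
    return 1 if weighted_sum > 0 else 0

# Assumed weights and bias realizing the logical AND gate
w = np.array([1.0, 1.0])
b = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(np.array(x), w, b))  # 0, 0, 0, 1
```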
Advantages of SLP
1. Linear Separability: Single-layer perceptrons are effective at
solving linearly separable classification problems. If the
classes can be separated by a single hyperplane, a single-layer
perceptron can learn to classify them accurately. For example,
classifying points in a 2D space into two distinct classes using
a straight line.
2. Low Computational Complexity: Due to their single-layer
architecture, training and using a single-layer perceptron is
computationally efficient. This makes them suitable for tasks
where real-time or rapid decision-making is required.
3 . Simplicity and Interpretability: Single-layer perceptrons have
a straightforward structure, which makes them easy to understand
and interpret. This can be useful for educational purposes or
when a simple, transparent model is desired.

4. Feature Extraction: Single-layer perceptrons can be used for


basic feature extraction tasks, such as identifying important
patterns or trends in data. This can be helpful for preprocessing
data before using it with more complex models.
5. Logical Operations: Single-layer perceptrons can be
employed to implement logical operations, such as AND, OR,
and NOT gates. These operations are useful in various
computing and control systems.

6. Early Learning Framework: Single-layer perceptrons are


historically significant as they paved the way for the
development of more sophisticated neural network
architectures. They provide insight into the fundamental
principles of neural networks and how learning algorithms
can be applied.
Disadvantages of SLP
• Limited to Linear Separation: Perhaps the most significant
limitation of single-layer perceptrons is that they can only
model and classify linearly separable data. This means they
cannot effectively handle problems where the decision
boundary is nonlinear, leading to poor performance on tasks
that involve complex patterns or relationships.
• Inability to Learn XOR and Other Complex Operations:
Single-layer perceptrons are unable to learn basic nonlinear
operations like the XOR function. XOR requires a more
complex decision boundary that cannot be represented by a
single hyperplane.
• Limited Feature Learning: Single-layer perceptrons have
minimal ability to automatically extract meaningful features
from raw data. They rely on hand-engineered features, which
can be time-consuming and may not capture the full
complexity of the data.
• Vulnerability to Outliers: Single-layer perceptrons are
sensitive to outliers in the training data, which can lead to the
misclassification of points and reduced generalization
performance.
• Lack of Hidden Representations: Single-layer perceptrons do
not have hidden layers, which means they cannot learn
hierarchical representations of data. Hidden layers are essential
for capturing complex features and abstractions in more
advanced neural network architectures.

• Not Suitable for Complex Data Domains: Real-world data


often exhibit nonlinear relationships, and single-layer
perceptrons are ill-equipped to handle such complexity. They
struggle with tasks like image and speech recognition, where
patterns and features can be highly intricate.
Hidden layer perceptron
• Hidden Layers: Hidden layers are the core of the architecture.
They are responsible for capturing and transforming the input
data through a series of transformations. A typical hidden layer
consists of multiple neurons, each connected to all neurons in
the previous layer. The number of hidden layers and the
number of neurons in each layer are configurable
hyperparameters.
• The middle part of the diagram represents the hidden layers.
These layers perform a series of transformations on the input
data to learn complex patterns and relationships. Each "O" in
the hidden layers represents a neuron in those layers. The
diagram shows multiple neurons connected to each other in a
fully connected manner, which is a common arrangement in
neural networks.
HIDDEN LAYER PERCEPTRON
Perceptron Types
1. Single Layer Perceptron model: One of the easiest ANN (Artificial Neural Network) types, it consists of a feed-forward network and includes a threshold transfer function inside the model. The main objective of the single-layer perceptron model is to analyze linearly separable objects with binary outcomes. A single-layer perceptron can learn only linearly separable patterns.

2. Multi-Layered Perceptron model: It is mainly similar to a


single-layer perceptron model but has more hidden layers.
Advantages of MLP:
1. A multi-layered perceptron model can solve complex non-
linear problems.
2. It works well with both small and large input data.
3. Helps us to obtain quick predictions after the training.
4. Helps us obtain the same accuracy ratio with big and small
data.
• Complex Pattern Learning: MLPs can learn complex patterns
and relationships in data due to their multiple hidden layers.
This makes them suitable for tasks that involve non-linear and
intricate mappings between inputs and outputs.

• Feature Learning: MLPs can automatically extract hierarchical


and abstract features from raw data, reducing the need for
manual feature engineering. This is especially valuable for
tasks with high-dimensional data, such as images and text.
• Flexibility: MLPs can be applied to various types of data,
including numerical, categorical, and sequential data. They can
handle both regression and classification problems.

• Parallel Processing: MLPs can process multiple data points in


parallel, which can lead to efficient training and prediction
times on modern hardware, such as GPUs.

• Non-Linear Activation: MLPs use non-linear activation


functions, such as ReLU, sigmoid, and tanh, which enable
them to capture complex and non-linear relationships within
the data.
• Transfer Learning: Pre-trained MLPs, especially those used as
feature extractors in deep learning models, can be fine-tuned
for specific tasks, saving time and resources in model
development.

• Ensemble Learning: Multiple MLPs can be combined to create


ensemble models, enhancing performance through diversity of
predictions.

• Availability of Frameworks: Popular deep learning


frameworks like TensorFlow and PyTorch provide easy-to-use
tools for building and training MLPs, making them accessible
to researchers and practitioners.
Disadvantages of MLP :
1. In multi-layered perceptron model, computations are time-
consuming and complex.
2. It is tough to predict how much the dependent variable affects
each independent variable.
3. The functioning of the model depends on the quality of the training data.
Characteristics of the Perceptron Model
1. It is a machine learning algorithm that uses supervised learning of binary
classifiers.
2. In Perceptron, the weight coefficient is automatically learned.
3. Initially, weights are multiplied with input features, and then the decision is
made whether the neuron is fired or not.
4. The activation function applies a step rule to check whether the function is
more significant than zero.
5. The linear decision boundary is drawn, enabling the distinction between the
two linearly separable classes +1 and -1.
6. If the added sum of all input values is more than the threshold value, it must have an output signal; otherwise, no output will be shown.
Limitation of a Perceptron model
The following are the limitation of a Perceptron model:

1. The output of a perceptron can only be a binary number


(0 or 1) due to the hard-edge transfer function.

2. It can only be used to classify the linearly separable sets of


input vectors. If the input vectors are non-linear, it is not easy to
classify them correctly.
Multi-Layer Perceptron
• The multi-layer perceptron defines the most complex architecture of artificial neural networks. It is substantially formed from multiple layers of perceptrons.
• TensorFlow is a very popular deep learning framework released by Google, and it can be used to build such a neural network. To understand what a multi-layer perceptron is, we can also develop one from scratch using NumPy, as sketched below.
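• A minimal NumPy sketch of the forward pass of such a network (the layer sizes, weights, and input are illustrative assumptions; training would additionally require backpropagation):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A tiny MLP: 3 inputs -> 4 hidden neurons -> 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer parameters
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer parameters

x = np.array([0.5, -0.2, 0.1])                  # illustrative input vector
hidden = relu(W1 @ x + b1)                      # hidden layer activations
output = sigmoid(W2 @ hidden + b2)              # network output
print(output)
```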
The pictorial representation of multi-layer
perceptron learning is as shown below-
• MLP networks are used in a supervised learning format. A typical learning algorithm for MLP networks is the backpropagation algorithm.
• A multilayer perceptron (MLP) is a feedforward artificial neural network that generates a set of outputs from a set of inputs. An MLP is characterized by several layers of input nodes connected as a directed graph between the input and output layers. MLP uses backpropagation for training the network. MLP is a deep learning method.
The training proceeds in two phases:

1. In the forward phase, the synaptic weights of the


network are fixed and the input signal is propagated
through the network, layer by layer, until it reaches the
output. Thus, in this phase, changes are confined to the
activation potentials and outputs of the neurons in the
network.
2. In the backward phase, an error signal is produced by
comparing the output of the network with a desired response. The
resulting error signal is propagated through the network, again
layer by layer, but this time the propagation is performed in the
backward direction. In this second phase, successive adjustments
are made to the synaptic weights of the network. Calculation of
the adjustments for the output layer is straightforward, but it is
much more challenging for the hidden layers.
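A minimal TensorFlow sketch of the two phases for a single neuron (the weights, input, target, and learning rate are illustrative assumptions; tf.GradientTape is used here in place of a hand-written backward pass):

```python
import tensorflow as tf

w = tf.Variable([[0.5], [-0.3]])        # illustrative synaptic weights
b = tf.Variable([0.0])
x = tf.constant([[1.0, 2.0]])           # input signal
target = tf.constant([[1.0]])           # desired response

with tf.GradientTape() as tape:
    # Forward phase: weights fixed, input propagated to the output
    y = tf.sigmoid(tf.matmul(x, w) + b)
    loss = tf.reduce_mean(tf.square(target - y))   # error signal

# Backward phase: the error is propagated back to get weight adjustments
grad_w, grad_b = tape.gradient(loss, [w, b])
w.assign_sub(0.1 * grad_w)   # gradient-descent update, learning rate 0.1
b.assign_sub(0.1 * grad_b)
print(loss.numpy(), w.numpy().ravel())
```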
TensorFlow APIs
• C API for TensorFlow

The only APIs having the official backing of TensorFlow are the C and Python APIs (some parts). The C API should be used whenever we are about to build TensorFlow bindings for some other language, as most languages have ways to connect with the C language.

• C++ API for TensorFlow

The runtime of TensorFlow is written in C++, and C++ is mostly connected to TensorFlow through header files in tensorflow. The C++ API is still in the experimental stages of development, but Google is committed to working with C++.
APIs Outside TensorFlow Project
• TFLearn: This API should not be confused with TF Learn, which is TensorFlow's tf.contrib.learn. It is a separate Python package.
• TensorLayer: It comes as a separate package and is different from what TensorFlow's layers API has in its bag.
• Pretty Tensor: It is a Google project which offers a fluent interface with chaining.
• Sonnet: It is a project of Google's DeepMind which features a modular approach.
• COCO is a large-scale object detection,
segmentation, and captioning dataset.
COCO has several features:
• COCO (Microsoft Common Objects in Context)
• Introduced by Lin et al. in Microsoft COCO: Common Objects in Context
• The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale
object detection, segmentation, key-point detection, and captioning dataset. The
dataset consists of 328K images.

• Splits: The first version of the MS COCO dataset was released in 2014. It contains 164K images split into training (83K), validation (41K) and test (41K) sets. In 2015 an additional test set of 81K images was released, including all the previous test images and 40K new images.

• Based on community feedback, in 2017 the training/validation split was changed


from 83K/41K to 118K/5K. The new split uses the same images and annotations.
The 2017 test set is a subset of 41K images of the 2015 test set. Additionally, the
2017 release contains a new unannotated dataset of 123K images.
LABELIMG TOOL FOR IMAGE
ANNOTATIONS
• LabelImg is an open-source graphical image annotation tool used for labeling
objects in images. It is commonly used in machine learning and computer vision
projects to create labeled datasets for training and testing object detection
algorithms. LabelImg provides a user-friendly interface that allows annotators to
draw bounding boxes around objects of interest in images and assign labels to those
boxes.
Here's a basic overview of how LabelImg works:
1) Installation:
LabelImg is usually installed on your local machine. It is written in Python and uses the Qt framework for its graphical user interface. You can install it using the following steps:
• Install the required packages:
```
pip install pyqt5 lxml
```
• Clone the LabelImg repository from GitHub:
```
git clone https://github.com/tzutalin/labelImg.git
```
• Navigate to the `labelImg` directory and run the application:
```
cd labelImg
python labelImg.py
```
• 2. Annotation Process:
• - Upon launching the application, you can open an image from your local storage.
• - You can then use the drawing tools to draw bounding boxes around the objects
you want to label in the image.
• - After drawing the bounding box, you can assign a label to it, such as "car,"
"dog," "person," etc.
• - You can also specify additional information, such as the object's pose,
truncatedness, and difficulty, if needed.
• - Annotations are saved in an XML format using the Pascal VOC format or
YOLO format, which can be used for training machine learning models.

• 3. Saving and Exporting Annotations:


• - Once you've completed annotating an image, you can save the annotations to an
XML file.
• - LabelImg also allows you to save the annotated images with overlaid bounding boxes for visual verification.
• 4. Supported Formats:
• - LabelImg supports various image formats, including JPEG, PNG, and
more.
• - Additionally, it supports both the Pascal VOC and YOLO annotation
formats, which are widely used for object detection tasks.

• LabelImg is a popular choice for simple annotation tasks and is especially


useful for small-scale projects or getting started with object detection.
However, for larger and more complex labeling projects, you might want to
consider more advanced tools or platforms that offer features like
collaboration, quality control, and efficient management of annotated data.
Steps involved in using the TensorFlow
Object Detection API:
• TensorFlow's Object Detection API is a powerful tool for building
and training object detection models using the TensorFlow
framework. The TensorFlow Object Detection API makes it easier to
train and deploy models for tasks like object detection, instance
segmentation, and more. Steps involved in using the TensorFlow
Object Detection API are -

• Installation and Setup:


• Install TensorFlow and other required dependencies.
• Clone the TensorFlow Models repository from GitHub, which
contains the Object Detection API:

• git clone https://github.com/tensorflow/models.git


• Data Preparation:
• Collect and label your training and evaluation images
using annotation tools like LabelImg.
• Convert the annotations to the required TFRecord format,
which is used for training.

• Configuration:
• Define the model architecture and training parameters in
a configuration file. The configuration file specifies the
base model, anchor settings, training schedule,
augmentation options, and more.
• Model Training:

• Use the TensorFlow Object Detection API to train your


custom model using the prepared data and configuration.
• During training, the API will adjust the model's weights to
minimize the detection loss.

• Model Evaluation:
• After training, evaluate your model's performance on a
separate evaluation dataset using metrics like mAP
(mean average precision).
• Inference and Deployment:

• Export the trained model checkpoint to a format that can


be used for inference, such as a frozen inference graph.
• Use the exported model for object detection tasks on
new images or videos
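• As a hedged sketch of running inference with a model exported from the Object Detection API (the directory and image file names are illustrative; the output keys follow the API's usual detection signature):

```python
import tensorflow as tf

# Load the exported SavedModel (path is illustrative)
detect_fn = tf.saved_model.load("exported_model/saved_model")

# Read an image and add a batch dimension (file name is illustrative)
img = tf.io.decode_jpeg(tf.io.read_file("test.jpg"), channels=3)
input_tensor = tf.expand_dims(tf.cast(img, tf.uint8), axis=0)

detections = detect_fn(input_tensor)
# Typical outputs include detection_boxes, detection_classes and detection_scores
print(detections["detection_scores"][0][:5].numpy())
```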
