AML Lecture1.3
Learning with
TensorFlow
22TCSE532
Name - Bavalpreet Singh
Email - [email protected]
What is TensorFlow?
Definition: TensorFlow is an open-source machine learning platform developed by Google.
Core concept: It uses data flow graphs for numerical computations.
● Performance:
○ Optimized for GPU and distributed computing
○ Supports parallel processing for faster computations
○ Offers both eager execution and graph execution modes
● Reshaping: tf.reshape()
● Transposing: tf.transpose()
● Concatenation: tf.concat()
● Slicing: tensor[start:end]
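A minimal sketch of the four operations listed above, applied to one small tensor. The tensor values and variable names are illustrative assumptions, not taken from the lecture.

```python
import tensorflow as tf

# A 2x3 tensor used for all four operations (values chosen for illustration).
t = tf.constant([[1, 2, 3],
                 [4, 5, 6]])

reshaped = tf.reshape(t, [3, 2])      # same six values laid out as 3 rows x 2 columns
transposed = tf.transpose(t)          # rows and columns swapped, shape (3, 2)
stacked = tf.concat([t, t], axis=0)   # two copies joined along the row axis, shape (4, 3)
first_two_cols = t[:, 0:2]            # slice: every row, columns 0 and 1, shape (2, 2)

print(reshaped.shape, transposed.shape, stacked.shape, first_two_cols.shape)
```

Note that reshaping and transposing both produce a (3, 2) result here, but the element order differs: reshape keeps the row-major order of the values, transpose swaps the axes.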
TensorFlow Data Types
a) Numeric Types:
b) String Type:
Here’s how you can define operations with both dynamic and static graph execution in TensorFlow 2.x:
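The slide's original code is not available, so the following is a minimal sketch under stated assumptions: the matrices and the function name `matmul_plus_one` are made up for illustration. It runs the same computation once eagerly (dynamic execution, the TensorFlow 2.x default) and once through `tf.function`, which traces the Python function into a static graph.

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])

# Dynamic (eager) execution: each operation runs immediately and returns a value.
eager_result = tf.matmul(a, b) + 1.0

# Static (graph) execution: tf.function traces the Python function into a graph
# on the first call and reuses that graph on later calls with matching inputs.
@tf.function
def matmul_plus_one(x, y):
    return tf.matmul(x, y) + 1.0

graph_result = matmul_plus_one(a, b)

print(eager_result.numpy())
print(graph_result.numpy())
```

Both modes should give the same numbers; the difference is when and how the work is scheduled, which tends to matter for performance and for exporting models.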
Explanation:
Conclusion:
3. Control Output Range: Activation functions can squash the output to a specific
range, making it easier to handle and interpret, especially in tasks like classification.
Types of Activation Functions:
● ReLU: tf.nn.relu()
● Sigmoid: tf.nn.sigmoid()
● Tanh: tf.nn.tanh()
● Softmax: tf.nn.softmax()
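To make the "control output range" point concrete, here is a plain-Python sketch of the formulas behind the four functions listed above. These are illustrative re-implementations written for this example, not TensorFlow's own code; in a model you would call the `tf.nn` functions directly.

```python
import math

def relu(x):
    # Output range [0, inf): negative inputs are clipped to zero.
    return max(0.0, x)

def sigmoid(x):
    # Output range (0, 1): often read as a probability in binary classification.
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Output range (-1, 1): a zero-centred alternative to sigmoid.
    return math.tanh(x)

def softmax(xs):
    # Turns a list of scores into probabilities that sum to 1.
    # Subtracting the maximum first avoids overflow in exp().
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(relu(-2.0), sigmoid(0.0), tanh(0.0))
print(softmax([1.0, 2.0, 3.0]))
```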
Softmax
Building Neural Networks with Multiple Layers
Explanation:
● Sequential API: Use tf.keras.Sequential to stack layers in a linear manner. This API is simple and straightforward,
making it easy to build models by specifying a list of layers.
● Dense Layers: Define fully connected layers with tf.keras.layers.Dense. Each Dense layer specifies the number of
neurons and the activation function.
Explanation:
● Layer Management: Add layers incrementally to a tf.keras.Sequential model. This approach provides more control
and flexibility, allowing you to modify the model structure more dynamically.
● Model Summary: Use model.summary() to display the model architecture and parameters. The output will be the
same as in the first example, but the layers are added one at a time.
Working with Multiple Layers:
● Building Neural Networks: Use tf.keras.Sequential to build neural networks with multiple layers. This API
simplifies the process by allowing you to specify layers in a list or add them incrementally.
● Layer Management: Incrementally adding layers can provide more flexibility and control over the model
structure. It is useful when you need to dynamically change the model architecture.
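The two construction styles described above can be sketched as follows. The layer sizes (4 input features, 8 hidden units, 3 output classes) are assumptions chosen for illustration, since the slide's original code is not available.

```python
import tensorflow as tf

# Style 1: pass the whole stack of layers as a list.
model_a = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                      # 4 input features (assumed)
    tf.keras.layers.Dense(8, activation="relu"),     # hidden layer, 8 neurons
    tf.keras.layers.Dense(3, activation="softmax"),  # output layer, 3 classes
])

# Style 2: start empty and add the same layers one at a time.
model_b = tf.keras.Sequential()
model_b.add(tf.keras.Input(shape=(4,)))
model_b.add(tf.keras.layers.Dense(8, activation="relu"))
model_b.add(tf.keras.layers.Dense(3, activation="softmax"))

# Both summaries should list the same layers and parameter counts.
model_a.summary()
model_b.summary()
```

With these sizes each model has (4 x 8 + 8) + (8 x 3 + 3) = 67 trainable parameters, which is a quick way to check that the two styles built the same architecture.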
Implementing Loss Functions
Mean Squared Error (MSE): Measures the average squared difference between the predicted and actual
values.
The example shows the loss for two samples, each belonging to different
classes.
Custom Loss Functions: Define a custom loss function by creating a function that takes true labels and
predictions as input and returns a scalar loss value.
The custom loss function calculates the Mean Absolute Error (MAE) between the predicted and true
values. The output shows the average absolute difference between the predicted and true values.
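A short sketch of both ideas: the built-in MSE loss and a hand-written MAE loss. The sample values and the name `custom_mae` are assumptions made for this example.

```python
import tensorflow as tf

# Three samples with one target value each (illustrative numbers).
y_true = tf.constant([[1.0], [2.0], [3.0]])
y_pred = tf.constant([[1.5], [2.0], [2.0]])

# Built-in Mean Squared Error: average of (0.5^2, 0^2, 1^2).
mse = tf.keras.losses.MeanSquaredError()
mse_value = mse(y_true, y_pred)

# Custom loss: takes true labels and predictions, returns a scalar.
def custom_mae(y_true, y_pred):
    return tf.reduce_mean(tf.abs(y_true - y_pred))

mae_value = custom_mae(y_true, y_pred)

print("MSE:", float(mse_value))
print("MAE:", float(mae_value))
```

A function with this signature can also be passed to `model.compile(loss=custom_mae)` so that Keras uses it during training.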
Understanding Backpropagation
Neural network training is about finding weights that minimize prediction error. We usually start training with a
set of randomly generated weights. Then, backpropagation is used to update the weights in an attempt to correctly
map inputs to outputs.
Our initial weights are as follows: w1 = 0.11, w2 = 0.21, w3 = 0.12, w4 = 0.08, w5 = 0.14 and w6 = 0.15
Weights
Our dataset has one sample with two inputs and one output.
Forward Pass
Now it is time to find out how our network performed by calculating the difference between the
actual output and the predicted one. It is clear that our network output, or prediction, is not even close
to the actual output. We can calculate the difference, or error, as follows.
Calculating Error
Our main goal in training is to reduce the error, that is, the difference between the prediction and the actual
output. Since the actual output is constant, the only way to reduce the error is to
change the prediction value. The question now is how to change the prediction value.
By decomposing the prediction into its basic elements, we find that the weights are the variable
elements affecting the prediction value. In other words, in order to change the prediction value, we need to
change the weight values.
Reducing Error
The question now is how to change (update) the weight
values so that the error is reduced.
Similarly, we can derive the update formula for w5 and any other weights
between the hidden layer and the output.
However, when moving backward to update w1, w2, w3 and w4, which lie between the input
and the hidden layer, the partial derivative of the error function with respect to w1, for
example, is as follows.
We can find the update formulas for the remaining weights w2, w3 and w4 in the same way.
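The slide formulas did not survive extraction, so the following is a reconstruction under stated assumptions: a network with two inputs i1 and i2, two linear hidden units h1 and h2 (no activation function), one linear output, squared error with a factor of 1/2, and a learning rate written as a. These assumptions are consistent with the 0.191 prediction quoted later, but the symbols are chosen here and may differ from the original slides.

```latex
\begin{aligned}
h_1 &= i_1 w_1 + i_2 w_2, \qquad
h_2 = i_1 w_3 + i_2 w_4, \qquad
\text{prediction} = h_1 w_5 + h_2 w_6 \\[4pt]
E &= \tfrac{1}{2}\,(\text{prediction} - \text{actual})^2, \qquad
\Delta = \text{prediction} - \text{actual} \\[4pt]
\frac{\partial E}{\partial w_6} &= \Delta\, h_2, \qquad
\frac{\partial E}{\partial w_5} = \Delta\, h_1 \\[4pt]
\frac{\partial E}{\partial w_1}
  &= \frac{\partial E}{\partial \text{prediction}}
     \cdot \frac{\partial \text{prediction}}{\partial h_1}
     \cdot \frac{\partial h_1}{\partial w_1}
   = \Delta\, w_5\, i_1 \\[4pt]
\frac{\partial E}{\partial w_2} &= \Delta\, w_5\, i_2, \qquad
\frac{\partial E}{\partial w_3} = \Delta\, w_6\, i_1, \qquad
\frac{\partial E}{\partial w_4} = \Delta\, w_6\, i_2 \\[4pt]
w_k &\leftarrow w_k - a\,\frac{\partial E}{\partial w_k}
\qquad (k = 1, \dots, 6)
\end{aligned}
```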
Learning rate: a hyperparameter, which means its value is not learned during training; we have to choose it manually, typically by trial and error.
Now, using the new weights, we repeat the forward pass.
Backward Pass
Notice that the new prediction, 0.26, is a little closer to the actual
output than the previous prediction, 0.191. We can repeat the same
process of backward and forward passes until the error is close or equal to
zero.
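The whole walkthrough can be reproduced in a few lines of plain Python. The weights are the ones given above; the inputs (2 and 3), the target (1), the learning rate (0.05), and the absence of activation functions are assumptions, chosen because together they reproduce the 0.191 and 0.26 predictions quoted in the slides.

```python
# Assumed sample and hyperparameter (not stated in the extracted text).
i1, i2 = 2.0, 3.0      # two inputs
actual = 1.0           # one target output
lr = 0.05              # learning rate

# Initial weights from the slides.
w = {"w1": 0.11, "w2": 0.21, "w3": 0.12, "w4": 0.08, "w5": 0.14, "w6": 0.15}

def forward(w):
    # Linear hidden units and linear output (no activation, by assumption).
    h1 = i1 * w["w1"] + i2 * w["w2"]
    h2 = i1 * w["w3"] + i2 * w["w4"]
    prediction = h1 * w["w5"] + h2 * w["w6"]
    return h1, h2, prediction

def backward(w):
    # Gradients of E = 0.5 * (prediction - actual)^2 via the chain rule.
    h1, h2, prediction = forward(w)
    delta = prediction - actual
    grads = {
        "w1": delta * w["w5"] * i1,
        "w2": delta * w["w5"] * i2,
        "w3": delta * w["w6"] * i1,
        "w4": delta * w["w6"] * i2,
        "w5": delta * h1,
        "w6": delta * h2,
    }
    # Gradient descent step: move each weight against its gradient.
    return {name: w[name] - lr * grads[name] for name in w}

_, _, before = forward(w)
new_w = backward(w)
_, _, after = forward(new_w)

print(round(before, 3), round(after, 2))  # 0.191 0.26
```

Calling `backward` repeatedly on its own output continues the loop described above, with the prediction moving closer to the target on each pass for this learning rate.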
Stochastic and Mini-Batch Gradient Descent
Stochastic Gradient Descent (SGD):
● Definition: Uses a single training example per iteration to update the parameters.
● Example:
○ Each iteration updates the parameters based on one randomly chosen training example (x(i), y(i)).
● Advantages:
○ Faster convergence for large datasets.
○ Allows for real-time or online learning.
● Disadvantages:
○ More noise in updates, leading to higher variance in the parameter updates.
● Use Case: Suitable for online learning and large datasets where the full dataset cannot fit into memory.
Mini-Batch Gradient Descent:
● Definition: Uses a small random subset (mini-batch) of training examples per iteration.
● Example:
○ Each iteration updates the parameters based on the average gradient of the mini-batch.
● Advantages:
○ Balances the efficiency of batch gradient descent and the speed of SGD.
○ Reduces the noise in parameter updates compared to SGD.
● Disadvantages:
○ Requires careful tuning of the mini-batch size.
● Use Case: Commonly used in practice due to a good balance between performance and computational efficiency.
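The difference between the two update rules can be sketched in plain Python on a toy problem: fitting a single weight w so that w * x matches y = 2x. The data, learning rate, epoch count, and function names are all assumptions made for this illustration; the SGD variant here shuffles the data each epoch and visits every example once, which is one common way to pick examples at random.

```python
import random

def gradient(w, x, y):
    # Derivative of 0.5 * (w * x - y)^2 with respect to w.
    return (w * x - y) * x

def sgd(data, lr=0.01, epochs=100, seed=0):
    # Stochastic gradient descent: one example per parameter update.
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for x, y in samples:
            w -= lr * gradient(w, x, y)
    return w

def minibatch_gd(data, batch_size=2, lr=0.01, epochs=100, seed=0):
    # Mini-batch gradient descent: average the gradient over a small batch.
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            avg_grad = sum(gradient(w, x, y) for x, y in batch) / len(batch)
            w -= lr * avg_grad
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
w_sgd = sgd(data)
w_mini = minibatch_gd(data)
print(w_sgd, w_mini)  # both should end up close to 2.0
```

On noise-free data like this both methods converge to the same weight; the variance difference described above shows up once the targets are noisy.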
Implementing batch and stochastic training in TensorFlow
https://colab.research.google.com/drive/1584uILesPmhe9Eh6kd7wnnj3CrLMSVHr?usp=sharing
Model Evaluation Metrics
Common Model Evaluation Metrics:
1. Accuracy:
○ Measures the proportion of correctly classified instances out of all
instances.
○ Suitable for balanced datasets.
2. Precision:
○ Measures the proportion of true positive predictions (correctly predicted
positive instances) among all positive predictions.
○ Helps in understanding the reliability of positive predictions.
3. Recall (Sensitivity):
○ Measures the proportion of true positive predictions among all actual
positive instances in the dataset.
○ Indicates how well the model captures positive instances.
4. F1 Score:
○ Harmonic mean of precision and recall. Provides a balance between
precision and recall.
○ Useful when there is an uneven class distribution.
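The four definitions above can be computed directly from the counts of true/false positives and negatives. This is a plain-Python sketch with made-up labels; the function names are chosen for this example.

```python
def confusion_counts(y_true, y_pred):
    # Count the four outcomes for binary labels (1 = positive, 0 = negative).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Here the model finds 3 of the 4 positives and raises 1 false alarm, so accuracy is 0.8 while precision, recall, and F1 are each 0.75.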
Accuracy
● What it means: How many total predictions were correct (both positives and negatives).
● Use it when: The data is balanced (equal positives and negatives).
● Example: If you have a model predicting whether an email is spam, and there are roughly equal spam and non-spam emails, accuracy is a good measure.
Precision
● What it means: Of all the times the model said “yes” (positive), how many were actually correct.
● Use it when: False positives are costly.
● Example: In spam detection, if marking a good email as spam (false positive) is a big problem, you want high precision. This means fewer non-spam emails are misclassified.
Recall
● What it means: Of all the actual positives, how many did the model correctly identify.
● Use it when: Missing positives is costly.
● Example: In disease detection, you don’t want to miss a positive case (false negative), so high recall is important to ensure most sick patients are identified.
F1-Score
Unit tests are crucial in software development, including TensorFlow code, for several reasons:
1. Verification of Functionality: Unit tests verify that individual units (functions, methods, or classes) of your code behave as
expected. In TensorFlow, this could mean checking that specific layers, models, or operations produce correct outputs for
given inputs.
2. Early Detection of Bugs: Writing tests helps catch bugs early in the development process, making debugging easier and
reducing the likelihood of encountering issues later on.
3. Maintainability: Tests serve as documentation and specification for your code. They make it easier for new developers to
understand the expected behavior of functions or modules without needing to delve into the implementation details.
4. Refactoring Confidence: When refactoring or modifying existing code, unit tests ensure that changes do not break
existing functionality. They act as a safety net, providing confidence that the system still works correctly after modifications.
5. Regression Testing: Unit tests form the basis for regression testing, ensuring that previously fixed bugs do not reappear
in subsequent versions of your code.
How to write Unit Tests?
● Isolate Tests: Ensure each test is independent and does not rely on the
state of other tests.
● Use Mocking: When testing TensorFlow models, you may need to mock
data or model inputs to simulate different scenarios.
● Coverage: Aim to cover edge cases and different inputs to ensure
robustness.
● Readable Assertions: Use meaningful assertions (assertEqual,
assertGreater, assertLess, etc.) to clearly define expected outcomes.
● Continuous Integration: Integrate unit tests into your development
workflow (e.g., using CI/CD tools) to automate testing and catch issues
early.
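A minimal sketch of these guidelines using Python's built-in `unittest` module. The function under test, `mean_squared_error`, is a small hypothetical helper written for this example; the same pattern (independent tests, edge cases, readable assertions) carries over to tests of TensorFlow layers and models.

```python
import unittest

def mean_squared_error(y_true, y_pred):
    # Hypothetical unit under test: average squared difference of two lists.
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

class MeanSquaredErrorTest(unittest.TestCase):
    # Each test builds its own inputs, so tests do not depend on each other.

    def test_perfect_prediction_gives_zero(self):
        self.assertEqual(mean_squared_error([1.0, 2.0], [1.0, 2.0]), 0.0)

    def test_known_value(self):
        # (0.5^2 + 0^2 + 1^2) / 3
        result = mean_squared_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
        self.assertAlmostEqual(result, 1.25 / 3)

    def test_length_mismatch_raises(self):
        # Edge case: malformed input should fail loudly.
        with self.assertRaises(ValueError):
            mean_squared_error([1.0], [1.0, 2.0])

def run_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanSquaredErrorTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)

if __name__ == "__main__":
    run_tests()
```

In a CI pipeline the same file would typically be run with `python -m unittest`, so a failing assertion blocks the change before it is merged.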
Multiple Executors for Distributed Training in TensorFlow
Distributed training in TensorFlow allows you to leverage multiple GPUs, TPUs, or machines to train large
models faster and more efficiently. TensorFlow supports various distributed training strategies, including
MirroredStrategy, MultiWorkerMirroredStrategy, and TPUStrategy.
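As a minimal sketch of the first of these strategies, the code below builds and compiles a small model inside a `MirroredStrategy` scope. The model shape and the per-replica batch size of 32 are assumptions for illustration; on a machine without multiple GPUs the strategy is expected to fall back to a single replica, so the code still runs.

```python
import tensorflow as tf

# MirroredStrategy replicates the model on each visible GPU of one machine
# and keeps the copies in sync by combining their gradients.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Variables must be created inside the strategy scope to be mirrored.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),                   # 10 features (assumed)
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# A common convention: scale the global batch with the number of replicas
# so that each replica still processes 32 examples per step (assumed size).
global_batch_size = 32 * strategy.num_replicas_in_sync
print("Global batch size:", global_batch_size)
```

Training then proceeds with an ordinary `model.fit(...)` call using a dataset batched to `global_batch_size`.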
Mirrored Strategy
● Each worker (machine) computes gradients on its local batch and then
synchronizes these gradients with other workers.
TPU Strategy
● Data Sharding: Ensure that your data is evenly distributed across the devices to avoid bottlenecks.
● Batch Size: Use a larger batch size to fully utilize the computational resources.
● Synchronization: Properly synchronize the updates across devices to ensure model consistency.
● Monitoring: Use TensorBoard to monitor the training process and adjust parameters as necessary.
Best Practices for Deploying TensorFlow Models
Optimizing a model involves several techniques aimed at reducing its size and improving its
inference speed. Here, we'll cover three key methods: quantization, pruning, and model
compression. Below is an example using TensorFlow.
Quantization
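The slide's original example is not available. To show the idea behind quantization, here is a plain-Python sketch of affine 8-bit quantization: floats are mapped to integers in [-128, 127] using a scale and a zero point, then mapped back. This illustrates the concept only; it is not the TensorFlow Lite converter, and the function names and weight values are assumptions.

```python
def quantize_int8(values):
    # Make sure 0.0 is inside the represented range so it maps to an integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Width of one integer step in float units (fallback avoids dividing by 0).
    scale = (hi - lo) / 255.0 or 1.0
    # Integer that represents the float value 0.0.
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(v / scale) + zero_point))
                 for v in values]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    # Map the integers back to approximate float values.
    return [(q - zero_point) * scale for q in quantized]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]   # illustrative float32 weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

print(q)
print([round(r, 4) for r in restored])
```

Each weight now needs one byte instead of four, at the cost of a rounding error of at most about one quantization step per value.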
https://www.tensorflow.org/tfx/serving/docker
https://github.com/nfmcclure/tensorflow_cookbook/tree/master
Productionizing TensorFlow Models
Productionizing TensorFlow models for deployment in real-world applications involves transforming a
trained model into a fully operational service or application that can handle real-time data, scale, and
provide predictions or insights. This process typically includes several stages, from model preparation to
deployment, monitoring, and maintenance. Here's a breakdown of the key steps involved:
● Data Preparation: Clean, preprocess, and engineer features from your data.
● Model Selection: Use TensorFlow to build and train the machine learning (ML) or deep learning (DL) model.
Common model types include neural networks, CNNs (for images), RNNs (for sequential data), etc.
● Model Evaluation: Evaluate the model's performance on a test set using appropriate metrics (accuracy, loss,
precision, recall, etc.).
● Save Model: Once satisfied with the model, save it using TensorFlow's SavedModel format. This is a
universal format that saves both the architecture and weights for later use.
2. Model Optimization
● Pruning & Quantization: Optimize the model to reduce size and improve
inference speed, especially for deployment on mobile or edge devices.
TensorFlow Lite provides tools for this:
○ Create a Dockerfile to define your environment and steps to run the model.
● APIs for Serving: You can use TensorFlow Serving to serve the model as an API
that can handle REST or gRPC requests. TensorFlow Serving is designed for
scalable and high-performance model serving.
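A client-side sketch of calling a model through TensorFlow Serving's REST API, using only the standard library. The host, the model name `my_model`, and the input row are hypothetical; port 8501 is the REST port commonly used in the TensorFlow Serving Docker examples. The request is built but not sent, so the code runs without a server.

```python
import json
import urllib.request

def build_predict_request(host, model_name, instances, port=8501):
    # TensorFlow Serving exposes predictions at /v1/models/<name>:predict
    # and expects a JSON body with an "instances" list.
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical model name and a single input row with four features.
request = build_predict_request("localhost", "my_model", [[1.0, 2.0, 3.0, 4.0]])
print(request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```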
a. Cloud Deployment
● Horizontal Scaling: As demand grows, you may need to scale the service to handle more
requests. This can be done using container orchestration tools like Kubernetes, or cloud
services such as AWS Elastic Beanstalk or Google Kubernetes Engine.
● Load Balancing: Implement load balancers to distribute incoming requests evenly across
multiple instances of your model service.
● Automation: Set up CI/CD pipelines to automate the process of deploying updated models. Whenever a new model version is trained, the pipeline will automatically package, test, and deploy the model.
○ GitLab CI, Jenkins, CircleCI: These tools can automate model deployment steps after successful training and evaluation.
● A/B Testing: Deploy multiple versions of your model (e.g., old and new) and split the incoming traffic between them to see which version performs better in production.
● Model Drift Monitoring: Over time, the model's performance may degrade due to changes in real-world data (data drift). You should monitor this and retrain the model when necessary.
● Retraining: Set up pipelines for continuous retraining of the model using new data, ensuring that the model stays up-to-date.
● Logging: Ensure that logs are maintained for all predictions made by the model. This helps in auditing and improving the model.
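One simple way to split traffic for the A/B testing described above is to hash a stable request key, so the same user always reaches the same model version. This is a sketch under assumptions: the function name, the use of a user ID as the key, and the 10% share for the new version are all illustrative.

```python
import hashlib

def choose_model_version(user_id, new_version_share=0.1):
    # Hash the user ID to a number in [0, 1); the same ID always maps
    # to the same bucket, so a user sees a consistent model version.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    return "new" if bucket < new_version_share else "old"

# Route 10,000 hypothetical users and check the observed split.
assignments = [choose_model_version(f"user-{i}") for i in range(10000)]
share_new = assignments.count("new") / len(assignments)
print(f"share routed to new model: {share_new:.3f}")
```

Logging the chosen version alongside each prediction makes it possible to compare the two models' production metrics afterwards.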
Key Components
This process ensures that TensorFlow models are effectively integrated into real-world
applications, able to scale, and monitored for long-term performance.