Assignment 10
ANS: To compare the cost efficiency of upgrading from a dual-CPU system to a GPU system for processing 1 million images, we need to consider both the initial hardware cost and the ongoing operational costs over a period of one year.
Performance:
Dual-CPU system: 100 images per hour
GPU system: 500 images per hour
Operational Costs:
Let’s assume the following hypothetical operational costs:
Dual-CPU system: $200 per month
GPU system: $400 per month
Step 1: Calculate Time to Process 1 Million Images
Dual-CPU System: 1,000,000 images ÷ 100 images/hour = 10,000 hours (≈ 417 days)
GPU System: 1,000,000 images ÷ 500 images/hour = 2,000 hours (≈ 83 days)
Step 2: Calculate Annual Operational Costs
Dual-CPU System: Operational Costs = 12 months × 200 USD/month = 2,400 USD
GPU System: Operational Costs = 12 months × 400 USD/month = 4,800 USD
Step 3: Cost to Process 1 Million Images
1. Dual-CPU System: 10,000 hours ≈ 13.9 months of operation, so roughly 13.9 × 200 USD ≈ 2,780 USD.
2. GPU System: 2,000 hours ≈ 2.8 months of operation, so roughly 2.8 × 400 USD ≈ 1,110 USD.
Although the GPU system costs twice as much per month, it finishes the workload five times faster, so its cost to process the 1 million images is less than half that of the dual-CPU system.
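These figures can be sanity-checked with a few lines of Python (the 720 hours per month is an assumption of round-the-clock operation):

images = 1_000_000
hours_per_month = 24 * 30  # assume 24/7 operation
for name, rate, monthly_cost in [("Dual-CPU", 100, 200), ("GPU", 500, 400)]:
    hours = images / rate                    # processing time in hours
    months = hours / hours_per_month         # equivalent months of operation
    print(f"{name}: {hours:,.0f} h ({months:.1f} months), ~${months * monthly_cost:,.0f}")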
1. Determine the Speedup: This is the ratio of the time taken with a single GPU to the
time taken with multiple GPUs.
2. Calculate the Scaling Efficiency: This is the speedup divided by the number of
GPUs.
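Neither the measured runtimes nor the GPU count are specified above, so the sketch below uses hypothetical values purely to illustrate the two formulas:

def scaling_metrics(time_one_gpu, time_n_gpus, num_gpus):
    speedup = time_one_gpu / time_n_gpus   # step 1: speedup
    efficiency = speedup / num_gpus        # step 2: scaling efficiency
    return speedup, efficiency

# Hypothetical example: 8 hours on 1 GPU vs. 2.5 hours on 4 GPUs
s, e = scaling_metrics(8.0, 2.5, 4)
print(f"Speedup: {s:.2f}x, scaling efficiency: {e:.0%}")  # 3.20x, 80%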
To determine the energy efficiency (in images per watt-hour) for both the GPU-based server
and the CPU-based server, we will follow these steps:
GPU-based server:
- Power consumption: 400 watts
- Processing time: 2 hours
- Images processed: 250,000
CPU-based server:
- Power consumption: 250 watts
- Processing time: 10 hours
- Images processed: 250,000
Energy consumption (in watt-hours) is calculated by multiplying the power consumption (in
watts) by the time (in hours).
1. GPU-based server: 400 W × 2 hours = 800 Wh
2. CPU-based server: 250 W × 10 hours = 2,500 Wh
Energy efficiency is calculated by dividing the number of images processed by the total
energy consumption.
1. GPU-based server: 250,000 images ÷ 800 Wh = 312.5 images per watt-hour
2. CPU-based server: 250,000 images ÷ 2,500 Wh = 100 images per watt-hour
The GPU-based server is therefore roughly 3.1 times more energy-efficient than the CPU-based server for this workload.
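The same calculation expressed as a short Python check:

for name, watts, hours in [("GPU", 400, 2), ("CPU", 250, 10)]:
    energy_wh = watts * hours            # energy consumption in watt-hours
    efficiency = 250_000 / energy_wh     # images per watt-hour
    print(f"{name}-based server: {energy_wh} Wh, {efficiency:.1f} images/Wh")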
Given Data
Input feature map: 64 × 64, with 64 channels
Filter size: 3 × 3 (applied across all 64 input channels)
Number of filters: 128
Stride: 1; Padding: 0
The formula for the output size of a convolutional layer is:
Output Size = (Input Size − Filter Size + 2 × Padding) / Stride + 1
Output Size = (64 − 3 + 2 × 0) / 1 + 1 = 62
Each filter is applied to a 3 × 3 region across all 64 input channels. Therefore, the number of operations per filter application is:
3 × 3 × 64 = 576 multiply-accumulate (MAC) operations
Each convolutional operation involves a multiply and an add (one MAC), so each filter application costs twice that number of FLOPs:
576 × 2 = 1,152 FLOPs
The output feature map has dimensions 62×62 and there are 128 filters. So, the total number
of output elements is:
62 × 62 × 128 = 492,032
The total number of FLOPs required to compute the output feature map is therefore:
Total FLOPs = 1,152 × 62 × 62 × 128
Step-by-step arithmetic:
1. Compute 62 × 62 = 3,844
2. Multiply by 128: 3,844 × 128 = 492,032
3. Multiply by 1,152: 492,032 × 1,152 = 566,820,864
Conclusion
The total number of floating-point operations (FLOPs) required to compute the output feature map for the given convolutional layer is 566,820,864, or roughly 0.57 GFLOPs.
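The whole calculation can be verified with a short script:

input_size, filter_size, padding, stride = 64, 3, 0, 1
in_channels, num_filters = 64, 128
out_size = (input_size - filter_size + 2 * padding) // stride + 1  # 62
macs_per_output = filter_size * filter_size * in_channels          # 576 MACs
flops_per_output = 2 * macs_per_output                             # 1,152 FLOPs
total_flops = flops_per_output * out_size * out_size * num_filters
print(f"{total_flops:,}")  # 566,820,864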
TensorFlow
Overview:
TensorFlow is an end-to-end, open-source machine learning framework developed by Google, widely used in both research and production.
GPU Acceleration:
TensorFlow provides built-in support for GPU acceleration using CUDA and cuDNN.
Users can leverage GPUs by simply installing the GPU version of TensorFlow and
setting device contexts in the code.
Key Features:
Flexible and powerful, suitable for both high-level and low-level operations.
TensorFlow Hub for reusable pre-trained models.
TensorFlow Extended (TFX) for production deployment.
Performance:
TensorFlow is highly optimized for GPUs and TPUs and scales well to distributed training, with additional gains available through graph compilation (XLA).
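For example, checking for a GPU and pinning an operation to it takes only a couple of lines (a minimal sketch using TensorFlow's standard device API):

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))  # lists any visible GPUs
with tf.device('/GPU:0'):                      # place the ops on the first GPU
    x = tf.random.normal((1024, 1024))
    y = tf.matmul(x, x)                        # executed on the GPU if present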
PyTorch
Overview:
GPU Acceleration:
PyTorch offers seamless GPU acceleration. Tensor operations are easy to move
between CPU and GPU.
It uses CUDA for GPU support and allows dynamic graph building, which can be
particularly useful for certain applications.
Key Features:
Dynamic computation graphs (define-by-run), which make debugging intuitive.
A Pythonic API that integrates naturally with NumPy and the scientific Python stack.
TorchScript for deployment and companion libraries such as torchvision.
Performance:
PyTorch is designed for flexibility and ease of use, with competitive performance,
especially in research contexts.
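For example, moving data to the GPU is a one-line operation (a minimal sketch; the tensor and sizes are illustrative):

import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.randn(1024, 1024).to(device)  # tensor now lives on the GPU if available
y = x @ x                               # computed on whichever device x is on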
Keras
Overview:
Keras is a high-level neural networks API, written in Python and capable of running
on top of TensorFlow, Theano, and other frameworks.
It is user-friendly and fast to prototype with.
GPU Acceleration:
Keras inherits GPU acceleration from its backend; with a TensorFlow backend, available GPUs are used automatically, with no code changes required.
Key Features:
Simple, consistent API designed for fast prototyping.
Modular building blocks (layers, optimizers, losses) that are easy to combine.
Built-in datasets, callbacks, and model-management utilities.
Performance:
Keras prioritizes ease of use and rapid development, with performance largely
dependent on the backend used (e.g., TensorFlow).
TensorFlow Implementation
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Load data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Define model
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile model (sparse loss because the labels are integer class indices)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model on the GPU if one is available
with tf.device('/GPU:0'):
    model.fit(x_train, y_train, epochs=10, batch_size=32)

# Evaluate model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"Test accuracy: {test_acc}")
PyTorch Implementation
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
# Load data
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.MNIST('./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST('./data', train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32, shuffle=False)
# Define model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28*28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

model = Net().cuda()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train model
for epoch in range(10):
    model.train()
    for data, target in train_loader:
        data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

# Evaluate model
correct = 0
total = 0
model.eval()
with torch.no_grad():
    for data, target in test_loader:
        data, target = data.cuda(), target.cuda()
        output = model(data)
        _, predicted = torch.max(output.data, 1)
        total += target.size(0)
        correct += (predicted == target).sum().item()
print(f"Test accuracy: {correct / total:.4f}")
Keras Implementation
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Flatten
# Load data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
# Define model
model = Sequential([
Flatten(input_shape=(28, 28)),
Dense(128, activation='relu'),
Dense(10, activation='softmax')
])
# Compile model (sparse loss because the labels are integer class indices)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# Train model
model.fit(x_train, y_train, epochs=10, batch_size=32)
# Evaluate model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"Test accuracy: {test_acc}")
Comprehensive Report: Real-Time Object Detection for Autonomous
Driving Using GPU Computing
1. Introduction
Autonomous driving depends on perception systems that can locate and classify pedestrians, vehicles, traffic signs, and other obstacles within milliseconds. This report examines how GPU computing makes such real-time object detection feasible, using a simplified YOLO model as a case study.
2. Problem Description
Autonomous vehicles must process a vast amount of visual data in real time to detect and classify objects accurately. The challenge lies in the need for high-speed processing to ensure safety and reliability. Traditional CPU-based systems struggle with the computational demands of real-time object detection due to their limited parallel processing capabilities.
Challenges:
- Processing high-resolution camera streams at real-time frame rates.
- Maintaining low, predictable latency so the vehicle can react in time.
- Sustaining detection accuracy across varied lighting, weather, and traffic conditions.
- The limited parallel throughput of CPU-only systems.
3. Role of GPU Computing
GPUs provide the massive parallelism that the convolutional workloads of modern detectors require, accelerating both training and inference to real-time rates.
4. Implementation Details
We will implement a simplified version of the YOLO (You Only Look Once) model for object detection. YOLO is known for its speed and accuracy, making it suitable for real-time applications.
4.1 Data Preparation
For simplicity, we'll use a subset of a well-known object detection dataset such as COCO (Common Objects in Context).
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixels and reserve 20% of the training images for validation
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

train_generator = datagen.flow_from_directory(
    'data/train', target_size=(416, 416),
    batch_size=32, class_mode='categorical',
    subset='training'
)
val_generator = datagen.flow_from_directory(
    'data/train', target_size=(416, 416),
    batch_size=32, class_mode='categorical',
    subset='validation'
)
4.2 Model Definition
from tensorflow.keras.layers import Conv2D, Input, BatchNormalization, LeakyReLU
from tensorflow.keras.models import Model

def build_model(input_shape=(416, 416, 3)):
    # Minimal conv block standing in for the full (elided) YOLO backbone
    inputs = Input(shape=input_shape)
    x = LeakyReLU(alpha=0.1)(BatchNormalization()(Conv2D(32, 3, padding='same')(inputs)))
    return Model(inputs, x)
# Evaluate the model
test_loss, test_acc = model.evaluate(val_generator, verbose=2)
print(f"Validation accuracy: {test_acc}")

# Summarize performance
import matplotlib.pyplot as plt
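The plotting code itself did not survive; below is a minimal sketch of what it likely contained, assuming the (elided) training step was run as history = model.fit(...):

# Hypothetical: assumes history = model.fit(train_generator,
#                                           validation_data=val_generator, epochs=10)
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()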
Example Comparison:
Inference Speed: the original YOLO paper reports roughly 45 frames per second on a high-end GPU, comfortably within real-time requirements, whereas CPU-only inference of comparable networks typically achieves only a few frames per second.
7. Conclusion
GPU computing is what makes real-time object detection practical for autonomous driving: the same detection network that is far too slow on a CPU reaches real-time frame rates on a GPU. As detection models and GPU hardware continue to improve, so will the safety and responsiveness of autonomous perception systems.
Deliverables
The final report should include sections for introduction, problem description,
role of GPU computing, implementation details, performance evaluation,
impact analysis, and conclusion, along with appropriate references and
appendices for the source code.