DL Experiments

The document outlines a project on Deep Learning conducted by students at Netaji Subhas University of Technology, detailing five experiments on various neural network architectures including BackPropagation, RMSProp, CNN, RNN, and U-Net. Each experiment includes theoretical foundations, practical implementations with code, and expected outputs. The focus is on applying these techniques to tasks such as classification, time series prediction, and image segmentation.

Deep Learning

EAEPE20
PROJECT FILE
2024-2025

Submitted by:
Name: Pratyusha Chaturvedi, Sarthak Chalia
Roll No.: 2022UEA6586, 2022UEA6619
Branch: ECAM
Section: 2

Submitted to: Dr Manisha Khulbe

Netaji Subhas University of Technology
Geeta Colony
New Delhi-110031

Table Of Contents

Sno.  TITLE  DATE  SIGN
1. Experiment 1: BackPropagation
2. Experiment 2: RMSProp (Root Mean Square Propagation)
3. Experiment 3: CNN (Convolutional Neural Network) for MNIST Classification
4. Experiment 4: RNN (Recurrent Neural Network) for Time Series Prediction
5. Experiment 5: U-Net for Image Segmentation

Experiment 1
TOPIC: BackPropagation

THEORY:
Backpropagation (backward propagation of errors) is a fundamental algorithm used for training artificial neural networks. It is a supervised learning method that calculates the gradient of the loss function with respect to the weights in the network.

Key Concepts:

1. Neural Network Basics:
   ○ Composed of layers (input, hidden, output)
   ○ Each neuron has weights and a bias
   ○ Activation functions introduce non-linearity
2. Forward Pass:
   ○ Input data flows through the network
   ○ Each layer applies weights, sums inputs, and applies activation
   ○ Final output is compared to target using loss function
3. Backward Pass:
   ○ Error is propagated backward through the network
   ○ Chain rule of calculus is used to compute gradients
   ○ Weights are updated to minimize the error
4. Mathematical Foundation:
   ○ For a neuron with activation function σ, input x, weights w, and bias b:

     z = w·x + b

     a = σ(z)

   ○ Chain rule for gradient calculation:

     ∂L/∂w = ∂L/∂a * ∂a/∂z * ∂z/∂w
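
As a rough illustration of the chain rule above, the sketch below computes ∂L/∂w for a single sigmoid neuron and compares it with a finite-difference estimate. The squared-error loss L = ½(a − y)² and the numeric values of w, b, x and y are assumptions chosen for the example; they are not taken from the experiment.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Squared-error loss of one sigmoid neuron (assumed loss for this sketch)."""
    a = sigmoid(w * x + b)
    return 0.5 * (a - y) ** 2

def grad_w(w, b, x, y):
    """dL/dw via the chain rule: dL/da * da/dz * dz/dw."""
    z = w * x + b
    a = sigmoid(z)
    dL_da = a - y          # derivative of 0.5 * (a - y)^2 with respect to a
    da_dz = a * (1.0 - a)  # derivative of the sigmoid
    dz_dw = x              # derivative of w*x + b with respect to w
    return dL_da * da_dz * dz_dw

# Illustrative values (assumptions, not from the experiment)
w, b, x, y = 0.5, -0.2, 1.5, 1.0

analytic = grad_w(w, b, x, y)

# Central finite difference as an independent check
eps = 1e-6
numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)

print(analytic, numeric)
```

The two printed values should agree to several decimal places, which is a common sanity check for a hand-written backward pass.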

Simple Experiment:
Let's implement a simple neural network with one hidden layer to learn the XOR function.

XOR Problem

The XOR function is a classic problem that a single perceptron cannot solve, because its two classes are not linearly separable; at least one hidden layer is required.
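
To make the claim about a single perceptron concrete, the following sketch tries every weight and bias combination on a small grid and records the best number of XOR cases a single threshold unit classifies correctly. The grid range and step size are arbitrary choices for the illustration.

```python
import itertools

# XOR truth table
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 0]

def perceptron(w1, w2, b, x):
    """Single threshold unit: outputs 1 when w1*x1 + w2*x2 + b > 0."""
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

# Hypothetical search grid: weights and bias from -2.0 to 2.0 in steps of 0.25
grid = [i / 4 for i in range(-8, 9)]

best_correct = 0
for w1, w2, b in itertools.product(grid, repeat=3):
    correct = sum(perceptron(w1, w2, b, x) == t for x, t in zip(inputs, targets))
    best_correct = max(best_correct, correct)

print(f"Best single-perceptron result: {best_correct} of 4 XOR cases correct")
```

No setting gets all four cases right, because a single unit can only draw one straight decision boundary; the best it manages is three of four.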


CODE:

import numpy as np

# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid, written in terms of the sigmoid's OUTPUT:
# pass in a = sigmoid(z), not z itself
def sigmoid_derivative(x):
    return x * (1 - x)

# Input dataset (XOR)
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])

# Output dataset
y = np.array([[0],
              [1],
              [1],
              [0]])

# Initialize weights randomly with mean 0
# (note: this network uses weights only, no bias terms)
np.random.seed(1)
hidden_weights = 2 * np.random.random((2, 2)) - 1
output_weights = 2 * np.random.random((2, 1)) - 1

# Hyperparameters
learning_rate = 0.1
epochs = 10000

# Training loop
for epoch in range(epochs):
    # Forward pass
    hidden_layer_input = np.dot(X, hidden_weights)
    hidden_layer_output = sigmoid(hidden_layer_input)

    output_layer_input = np.dot(hidden_layer_output, output_weights)
    predicted_output = sigmoid(output_layer_input)

    # Calculate error
    error = y - predicted_output

    # Backpropagation
    # Output layer error
    output_delta = error * sigmoid_derivative(predicted_output)

    # Hidden layer error
    hidden_error = output_delta.dot(output_weights.T)
    hidden_delta = hidden_error * sigmoid_derivative(hidden_layer_output)

    # Update weights (error = y - prediction, so adding the update descends the loss)
    output_weights += hidden_layer_output.T.dot(output_delta) * learning_rate
    hidden_weights += X.T.dot(hidden_delta) * learning_rate

    # Print error every 1000 epochs
    if epoch % 1000 == 0:
        print(f"Epoch {epoch}, Error: {np.mean(np.abs(error))}")

# Test the network
print("\nFinal predictions:")
hidden_layer_input = np.dot(X, hidden_weights)
hidden_layer_output = sigmoid(hidden_layer_input)
output_layer_input = np.dot(hidden_layer_output, output_weights)
predicted_output = sigmoid(output_layer_input)
print(predicted_output)

OUTPUT:
Experiment 2
TOPIC: RMSProp (Root Mean Square Propagation)

THEORY:
RMSprop is an adaptive learning rate optimization algorithm designed to address the limitations of vanilla Stochastic Gradient Descent (SGD). It adjusts the learning rate for each parameter based on the magnitude of recent gradients.
Why Use RMSprop?

● Normalizes each update by the recent gradient magnitude, so parameters with very small or very large gradients still take reasonably sized steps (this eases, but does not eliminate, vanishing/exploding gradient problems in deep networks).
● Works well for non-convex optimization (common in deep learning).
● Often converges faster than plain SGD with a single fixed learning rate.
Mathematical Formulation:
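
A common statement of the RMSprop update, with gradient g_t, decay rate ρ, learning rate η and a small constant ε for numerical stability, is:

v_t = ρ·v_(t−1) + (1 − ρ)·g_t²

θ_(t+1) = θ_t − η·g_t / (√v_t + ε)

The sketch below applies this update in plain Python. It is an illustration only and not the Keras implementation; the test function f(x, y) = x² + 10·y², the starting point and the hyperparameter values are assumptions made for the example.

```python
import math

def rmsprop_minimize(grad_fn, theta, learning_rate=0.01, rho=0.9, eps=1e-8, steps=2000):
    """Plain-Python RMSprop loop (a sketch, not the Keras implementation)."""
    v = [0.0] * len(theta)  # running average of squared gradients, one per parameter
    theta = list(theta)
    for _ in range(steps):
        g = grad_fn(theta)
        for i in range(len(theta)):
            v[i] = rho * v[i] + (1.0 - rho) * g[i] ** 2
            theta[i] -= learning_rate * g[i] / (math.sqrt(v[i]) + eps)
    return theta

# Hypothetical test function f(x, y) = x^2 + 10*y^2, minimum at (0, 0)
def grad_f(theta):
    x, y = theta
    return [2.0 * x, 20.0 * y]

result = rmsprop_minimize(grad_f, [3.0, -2.0])
print(result)
```

Although the gradient along y is ten times steeper than along x, both coordinates move at a similar pace because each is divided by its own running gradient magnitude. With a constant learning rate the iterates end up hovering close to the minimum rather than settling on it exactly.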

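Simple Experiment:
● RMSprop (Root Mean Square Propagation) adapts the learning rate per parameter.
● It divides the gradient by the square root of a running average of its recent squared values.
● Helps in faster convergence, especially for non-convex optimization.
● The code trains a small dense classifier on synthetic data with the RMSprop optimizer and plots the training and validation loss.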

CODE:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import RMSprop

# Generate synthetic data
# (labels are random and independent of X, so accuracy is expected
# to stay near chance; the aim is only to exercise the optimizer)
X = np.random.rand(1000, 10)
y = np.random.randint(0, 2, (1000, 1))

# Define a simple neural network
model = Sequential([
    Dense(64, activation='relu', input_shape=(10,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

# Use RMSprop optimizer
optimizer = RMSprop(learning_rate=0.001, rho=0.9)

model.compile(optimizer=optimizer, loss='binary_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(X, y, epochs=50, batch_size=32,
                    validation_split=0.2)

# Plot training history
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend()
plt.show()

OUTPUT:
Experiment 3
TOPIC: CNN (Convolutional Neural Network) for MNIST Classification

THEORY:
CNNs are designed for grid-structured data (images, videos) using local connectivity and weight sharing to detect spatial hierarchies.
Core Components:

1. Convolutional Layer:
   ○ Applies filters (kernels) to detect features (edges, textures).
   ○ Outputs a feature map via cross-correlation of the input with each filter.
2. Pooling Layer (Max/Average):
   ○ Reduces spatial dimensions and adds a degree of translation invariance.
   ○ Example (Max Pooling): Output = max(Window)
3. Flatten + Fully Connected Layers:
   ○ Converts feature maps into class probabilities.
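
The two core operations above can be sketched in NumPy. For an image I and kernel K, the cross-correlation used by convolutional layers is usually written S(i, j) = Σ_m Σ_n I(i + m, j + n)·K(m, n). The toy image and the 2×2 edge filter below are made-up values for illustration, and the loops are written for clarity rather than speed.

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel over the image without flipping it."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

# Toy 5x5 image with a vertical edge, and a hypothetical 2x2 edge filter
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

feature_map = cross_correlate2d(image, kernel)  # shape (4, 4)
pooled = max_pool2d(feature_map)                # shape (2, 2)
print(feature_map)
print(pooled)
```

The feature map is non-zero only in the column where the image switches from 0 to 1, and pooling keeps that response while halving each spatial dimension.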
Why CNNs Work for Images?

I. Local receptive fields: Detect small regions (e.g., edges).

II. Weight sharing: Same filter scans entire image (efficient).

III. Hierarchical features: Early layers detect edges → later layers detect objects.

Limitations:

I. Computationally expensive for high-resolution images.

II. Struggles with spatial deformations (partly addressed by data augmentation).

Simple Experiment:
Train a CNN to recognize handwritten digits (0-9) from the MNIST dataset. The CNN learns hierarchical features (edges → curves → digits) through convolutional and pooling layers, and a network of this size typically reaches about 99% test accuracy.
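
It can help to trace how the spatial size shrinks through the network used in this experiment's code (3×3 convolutions without padding, each followed by 2×2 max pooling, on 28×28 inputs). The helper functions below are written for this note and are not part of Keras; they assume stride-1 unpadded convolutions and non-overlapping pooling, which are the Keras defaults for these layers.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, window):
    """Output width of non-overlapping pooling (any remainder is dropped)."""
    return size // window

size = 28                      # MNIST images are 28x28
size = conv_out(size, 3)       # Conv2D(32, 3x3)   -> 26
size = pool_out(size, 2)       # MaxPooling2D(2x2) -> 13
size = conv_out(size, 3)       # Conv2D(64, 3x3)   -> 11
size = pool_out(size, 2)       # MaxPooling2D(2x2) -> 5
flattened = size * size * 64   # values fed from Flatten() into the dense layers
print(size, flattened)         # prints: 5 1600
```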
CODE:

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Load MNIST dataset and scale pixel values to [0, 1]
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(-1, 28, 28, 1) / 255.0
X_test = X_test.reshape(-1, 28, 28, 1) / 255.0

# Define CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model (the test set doubles as validation data here)
history = model.fit(X_train, y_train, epochs=5, batch_size=64,
                    validation_data=(X_test, y_test))

# Evaluate
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc:.4f}")

OUTPUT:
Experiment 4
TOPIC: RNN (Recurrent Neural Network) for Time Series Prediction.

THEORY:
RNNs process sequential data (time series, text) by maintaining a hidden state that captures temporal dependencies.
Mathematical Formulation:

For input x_t and hidden state h_t at time step t:

1. Hidden State Update:

   h_t = σ(W_x·x_t + W_h·h_(t−1) + b_h)

   (where σ is a nonlinearity like tanh or ReLU)

2. Output:

   y_t = W_y·h_t + b_y
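
A minimal NumPy sketch of this recurrence, assuming the usual vanilla-RNN form h_t = tanh(W_x·x_t + W_h·h_(t−1) + b_h) and y_t = W_y·h_t + b_y. The layer sizes, random weights and input sequence are illustrative assumptions; nothing is trained here.

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b_h, W_y, b_y):
    """Run a vanilla RNN over a sequence; return all outputs and the final hidden state."""
    h = np.zeros(W_h.shape[0])  # initial hidden state
    outputs = []
    for x_t in xs:
        h = np.tanh(W_x @ x_t + W_h @ h + b_h)  # hidden state update
        outputs.append(W_y @ h + b_y)           # output at this time step
    return np.array(outputs), h

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, steps = 1, 4, 1, 6  # illustrative sizes
W_x = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_h = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
W_y = rng.normal(scale=0.5, size=(output_dim, hidden_dim))
b_y = np.zeros(output_dim)

xs = np.sin(0.5 * np.arange(steps)).reshape(steps, input_dim)
ys, h_last = rnn_forward(xs, W_x, W_h, b_h, W_y, b_y)
print(ys.shape, h_last.shape)  # prints: (6, 1) (4,)
```

The same weight matrices are reused at every time step; only the hidden state h carries information forward along the sequence.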

Problems with Vanilla RNNs:

● Vanishing gradients: Long-term dependencies are hard to learn.
● Exploding gradients: Unstable training.

Solution: LSTM (Long Short-Term Memory)

● Introduces gates (input, forget, output) to regulate information flow.
● Forget Gate: Decides what to discard from cell state.
● Input Gate: Decides what new information to write to the cell state.
● Output Gate: Decides how much of the cell state is exposed as the hidden state.
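
As a sketch of how the gates interact, the function below performs one LSTM time step in NumPy using the standard formulation: c_t = f·c_(t−1) + i·g and h_t = o·tanh(c_t). The stacked weight layout (forget, input, output, candidate), the sizes and the random weights are choices made for this illustration and do not mirror the internal layout used by Keras.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [h_prev, x_t] to four stacked gate pre-activations."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])  # forget gate: what to discard from the cell state
    i = sigmoid(z[1 * hidden:2 * hidden])  # input gate: how much new information to write
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate: how much of the cell state to expose
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate values for the cell state
    c = f * c_prev + i * g                 # updated cell state
    h = o * np.tanh(c)                     # updated hidden state
    return h, c

rng = np.random.default_rng(0)
input_dim, hidden_dim = 1, 3  # illustrative sizes
W = rng.normal(scale=0.5, size=(4 * hidden_dim, hidden_dim + input_dim))
b = np.zeros(4 * hidden_dim)

h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for x in [0.0, 0.5, 1.0]:
    h, c = lstm_step(np.array([x]), h, c, W, b)
print(h, c)
```

Because the cell state is updated additively (f·c_(t−1) + i·g) rather than being squashed through a nonlinearity at every step, gradients can flow across many time steps more easily than in a vanilla RNN.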
Applications:

1. Time-series forecasting.

2. Natural Language Processing (NLP).

3. Speech recognition.

Limitations:

1. Computationally expensive for long sequences, since time steps must be processed one after another.

2. Replaced by Transformers in many NLP tasks.

Simple Experiment:

Use an LSTM-based RNN to predict future values in a synthetic sine wave time series. The RNN captures temporal patterns by maintaining hidden states, demonstrating sequential data modeling.
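
Before a series can be fed to the network it has to be cut into (input window, next value) pairs. The sketch below shows that sliding-window step on a toy series; the function name make_windows and the window length of 3 are choices made for this illustration.

```python
import numpy as np

def make_windows(series, window):
    """Return (inputs, targets): each input holds `window` consecutive values,
    and its target is the value that immediately follows."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    Y = np.array(series[window:])
    return X, Y

series = np.arange(6)                  # toy series: 0, 1, 2, 3, 4, 5
X, Y = make_windows(series, window=3)
print(X)   # rows: [0 1 2], [1 2 3], [2 3 4]
print(Y)   # [3 4 5]
```

A series of length n yields n − window training pairs, so longer windows give the model more context but fewer examples.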

CODE:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Generate synthetic time-series data: a noisy sine wave
def generate_data(n=1000):
    t = np.arange(n)
    y = np.sin(0.02 * t) + 0.1 * np.random.randn(n)
    return y

y = generate_data()

# Prepare sequences: each sample is `window` past values, the target is the next value
def create_dataset(data, window=10):
    X, Y = [], []
    for i in range(len(data) - window):
        X.append(data[i:i + window])
        Y.append(data[i + window])
    return np.array(X), np.array(Y)

X, Y = create_dataset(y)

# Reshape for LSTM (samples, timesteps, features)
X = X.reshape(-1, 10, 1)

# Define LSTM model
model = Sequential([
    LSTM(50, activation='tanh', input_shape=(10, 1)),
    Dense(1)
])

model.compile(optimizer='adam', loss='mse')

# Train
model.fit(X, Y, epochs=20, batch_size=16, validation_split=0.2)

# Predict the next, unseen value from the last 10 observations of the series
last_window = y[-10:].reshape(1, 10, 1)
prediction = model.predict(last_window)
print("Predicted next value:", prediction[0][0])

OUTPUT:
Experiment 5
TOPIC: U-Net for Image Segmentation

THEORY:
Image segmentation assigns a class label to each pixel (e.g., tumor in MRI scans). U-Net is a CNN architecture with skip connections for precise localization.
U-Net Architecture:

1. Encoder (Downsampling):
   ○ Conv + Pooling layers extract high-level features.
2. Bottleneck:
   ○ Captures the most abstract representations.
3. Decoder (Upsampling):
   ○ Upsampling layers recover spatial resolution. Transposed convolutions are a common choice; the code in this experiment uses the simpler UpSampling2D layer, which repeats pixels and has no learned weights.
4. Skip Connections:
   ○ Concatenates encoder features with decoder features of the same resolution (preserves fine details).
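
The decoder step can be pictured with plain arrays: upsample the low-resolution features, then concatenate them with the encoder features of matching resolution along the channel axis. The shapes below follow the first decoder stage of the model in this experiment; the helper upsample2x is written for this note and only mimics nearest-neighbour upsampling.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (height, width, channels) array."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

encoder_features = np.ones((64, 64, 32))  # output of the second encoder stage
bottleneck = np.ones((32, 32, 64))        # deeper, lower-resolution features

upsampled = upsample2x(bottleneck)                               # (64, 64, 64)
merged = np.concatenate([upsampled, encoder_features], axis=-1)  # (64, 64, 96)
print(upsampled.shape, merged.shape)
```

The convolution that follows the concatenation therefore sees both the abstract bottleneck features and the finer encoder features at every pixel.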
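Loss Function:

With y_i the true label of pixel i, p_i the predicted foreground probability and N the number of pixels:

● Binary Cross-Entropy (for 2-class segmentation):

  L_BCE = −(1/N)·Σ_i [y_i·log(p_i) + (1 − y_i)·log(1 − p_i)]

● Dice Loss (for imbalanced classes):

  L_Dice = 1 − (2·Σ_i p_i·y_i) / (Σ_i p_i + Σ_i y_i)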
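
Both losses can be sketched in NumPy using their standard definitions: binary cross-entropy averages −[y·log(p) + (1 − y)·log(1 − p)] over all pixels, and Dice loss is 1 − 2·Σ(p·y) / (Σp + Σy). The smoothing constant and the clipping threshold below are assumptions added for numerical safety, and the toy mask is made up for the example.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over all pixels; predictions are clipped to avoid log(0)."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def dice_loss(y_true, y_pred, smooth=1.0):
    """1 - Dice coefficient; `smooth` (an assumed constant) avoids division by zero."""
    intersection = np.sum(y_true * y_pred)
    dice = (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)
    return float(1.0 - dice)

# Toy 4x4 mask with a small foreground region (illustrative values)
y_true = np.zeros((4, 4))
y_true[1:3, 1:3] = 1.0
perfect = y_true.copy()
all_background = np.zeros((4, 4))

print(dice_loss(y_true, perfect))          # 0.0 for a perfect prediction
print(dice_loss(y_true, all_background))   # about 0.8 when the foreground is missed
print(binary_cross_entropy(y_true, all_background))
```

Dice loss depends only on the overlap with the foreground, so a prediction that ignores a small object is penalized heavily even though most pixels are labelled correctly.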


Why U-Net?

1. Works with small datasets (medical imaging).

2. Preserves spatial details via skip connections.

Limitations:

1. Requires precise annotations (expensive to label).

2. Struggles with class imbalance (e.g., small tumors).

Simple Experiment:

Simulate binary segmentation (foreground/background) on synthetic grayscale images. The U-Net combines downsampling (encoder) and upsampling (decoder) with skip connections to precisely segment pixel-wise classes.

CODE:
import tensorflow as tf
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D,
                                     UpSampling2D, Concatenate)
from tensorflow.keras.models import Model
import numpy as np

# Generate synthetic images and masks
# (the mask marks every pixel brighter than 0.5 as foreground)
def generate_data(n=100, img_size=(128, 128)):
    X = np.random.rand(n, *img_size, 1)
    Y = (X > 0.5).astype(np.float32)  # Binary masks
    return X, Y

X_train, Y_train = generate_data(100)
X_test, Y_test = generate_data(20)

# Define U-Net model
inputs = Input((128, 128, 1))

# Encoder
c1 = Conv2D(16, (3, 3), activation='relu', padding='same')(inputs)
p1 = MaxPooling2D((2, 2))(c1)

c2 = Conv2D(32, (3, 3), activation='relu', padding='same')(p1)
p2 = MaxPooling2D((2, 2))(c2)

# Bottleneck
b = Conv2D(64, (3, 3), activation='relu', padding='same')(p2)

# Decoder (each stage upsamples, then concatenates the matching encoder output)
u1 = UpSampling2D((2, 2))(b)
u1 = Concatenate()([u1, c2])
c3 = Conv2D(32, (3, 3), activation='relu', padding='same')(u1)

u2 = UpSampling2D((2, 2))(c3)
u2 = Concatenate()([u2, c1])
c4 = Conv2D(16, (3, 3), activation='relu', padding='same')(u2)

# 1x1 convolution with sigmoid gives a per-pixel foreground probability
outputs = Conv2D(1, (1, 1), activation='sigmoid')(c4)

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

# Train
model.fit(X_train, Y_train, epochs=10, batch_size=8,
          validation_data=(X_test, Y_test))

# Predict
pred_mask = model.predict(X_test[0:1])
print("Predicted mask shape:", pred_mask.shape)

OUTPUT:
