
Exercise 8: Build AlexNet using Advanced CNN

AlexNet is a deep convolutional neural network architecture that gained significant attention
and played a crucial role in advancing the field of deep learning and computer vision. It was
developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, and it won the
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012, marking a
breakthrough in image classification tasks.
Here are some key features and components of AlexNet:
Deep Convolutional Layers: AlexNet consists of eight layers of learnable parameters,
including five convolutional layers and three fully connected layers. The convolutional layers
are designed to automatically learn hierarchical features from input images.
Rectified Linear Units (ReLU): AlexNet uses the ReLU activation function in its hidden
layers, which helps mitigate the vanishing gradient problem and accelerates convergence
during training.
Local Response Normalization: The model incorporates local response normalization
(LRN) after the first and second convolutional layers. This normalization helps improve
generalization by normalizing responses within local receptive fields.
Max-Pooling: Max-pooling layers are used after the first, second, and fifth convolutional
layers to downsample the feature maps and reduce the spatial dimensions.
Large-Scale Dataset: AlexNet was trained on the ImageNet (ILSVRC) dataset, which contains
over a million labeled images spanning 1,000 object categories, making it a large-scale
image classification network.
Dropout: Dropout, a regularization technique, is applied to the fully connected layers to
prevent overfitting.
Softmax Activation: The output layer of AlexNet uses the softmax activation function to
compute class probabilities for image classification tasks.
Parallelism: During training, AlexNet was one of the first models to take advantage of GPU
parallelism, which significantly accelerated the training process.
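The features above can be combined into a compact Keras sketch of the original 2D architecture. This is an illustrative outline rather than the exact published configuration: the input size, strides, and pooling settings follow common AlexNet reimplementations, and the local response normalization layers are omitted because Keras has no built-in LRN layer.

from tensorflow.keras import layers, models

def alexnet_2d(input_shape=(227, 227, 3), num_classes=1000):
    model = models.Sequential()
    # Conv1: 96 filters, 11x11, stride 4, followed by overlapping max-pooling
    model.add(layers.Conv2D(96, 11, strides=4, activation='relu', input_shape=input_shape))
    model.add(layers.MaxPooling2D(pool_size=3, strides=2))
    # Conv2: 256 filters, 5x5
    model.add(layers.Conv2D(256, 5, padding='same', activation='relu'))
    model.add(layers.MaxPooling2D(pool_size=3, strides=2))
    # Conv3-5: 384, 384, 256 filters, 3x3
    model.add(layers.Conv2D(384, 3, padding='same', activation='relu'))
    model.add(layers.Conv2D(384, 3, padding='same', activation='relu'))
    model.add(layers.Conv2D(256, 3, padding='same', activation='relu'))
    model.add(layers.MaxPooling2D(pool_size=3, strides=2))
    # Fully connected head with dropout and a softmax over the class labels
    model.add(layers.Flatten())
    model.add(layers.Dense(4096, activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(4096, activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(num_classes, activation='softmax'))
    return model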
AlexNet demonstrated the effectiveness of deep convolutional neural networks for image
classification tasks and led to a surge of interest in deep learning research. It laid the
foundation for subsequent CNN architectures like VGG, GoogLeNet (Inception), and
ResNet, which have further improved the state-of-the-art performance on various computer
vision tasks.
IMDB dataset:

The IMDB dataset is a widely recognized benchmark in the field of natural language
processing (NLP), primarily used for sentiment analysis. It consists of 50,000 movie reviews, equally
split between positive and negative sentiments, with 25,000 reviews designated for training and the
remaining 25,000 for testing. Each review is labeled either as positive or negative, making it a binary
classification problem. The reviews vary significantly in length, which adds to the complexity of the
task. To process the textual data, common preprocessing steps include tokenization, removal of stop
words, and converting the text into numerical representations using techniques like Bag of Words,
TF-IDF, or word embeddings such as Word2Vec or GloVe. The IMDB dataset is not only instrumental
in training models for sentiment analysis but also plays a key role in broader text classification tasks,
making it an essential tool for researchers and practitioners alike.
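As a quick illustration of how Keras serves this dataset, the snippet below (a minimal sketch) loads the reviews as integer sequences and decodes one back to words; the +3 offset follows the Keras convention that reserves indices 0-2 for padding, start-of-sequence, and unknown tokens.

from tensorflow.keras.datasets import imdb

(x_train, y_train), _ = imdb.load_data(num_words=10000)
word_index = imdb.get_word_index()                         # word -> index mapping
index_to_word = {i + 3: w for w, i in word_index.items()}  # shift past the 3 reserved indices
decoded = ' '.join(index_to_word.get(i, '?') for i in x_train[0])
print(y_train[0], decoded[:200])  # label (0 = negative, 1 = positive) and start of the review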
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, MaxPooling1D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Set random seed for reproducibility
tf.random.set_seed(42)

# Parameters
max_features = 10000   # Number of words to consider as features
max_len = 500          # Max sequence length for each review
embedding_dim = 128    # Embedding dimensions for each word

# Load the IMDB dataset
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

# Pad sequences to ensure uniform input size
x_train = sequence.pad_sequences(x_train, maxlen=max_len)
x_test = sequence.pad_sequences(x_test, maxlen=max_len)
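# Note (illustrative): pad_sequences pads and truncates at the front by default, e.g.
#   pad_sequences([[1, 2, 3]], maxlen=5)          -> [[0, 0, 1, 2, 3]]
#   pad_sequences([[1, 2, 3, 4, 5, 6]], maxlen=5) -> [[2, 3, 4, 5, 6]]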

def build_alexnet():
    model = Sequential()
    # Embedding layer (maps integer-encoded words to dense vectors)
    model.add(Embedding(max_features, embedding_dim, input_length=max_len))
    # 1st Convolutional Layer
    model.add(Conv1D(filters=96, kernel_size=11, strides=1, activation='relu', padding='same'))
    model.add(MaxPooling1D(pool_size=2, strides=2, padding='same'))
    # 2nd Convolutional Layer
    model.add(Conv1D(filters=256, kernel_size=5, activation='relu', padding='same'))
    model.add(MaxPooling1D(pool_size=2, strides=2, padding='same'))
    # 3rd, 4th, 5th Convolutional Layers
    model.add(Conv1D(filters=384, kernel_size=3, activation='relu', padding='same'))
    model.add(Conv1D(filters=384, kernel_size=3, activation='relu', padding='same'))
    model.add(Conv1D(filters=256, kernel_size=3, activation='relu', padding='same'))
    model.add(MaxPooling1D(pool_size=2, strides=2, padding='same'))

    # Flatten layer
    model.add(Flatten())
    # 1st Fully Connected Layer
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    # 2nd Fully Connected Layer
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    # Output layer (binary classification)
    model.add(Dense(1, activation='sigmoid'))
    return model

# Initialize the model
model = build_alexnet()

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Display model architecture
model.summary()

# Train the model
batch_size = 256
epochs = 10

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(x_test, y_test),
                    verbose=1)

# Training accuracy and loss
train_acc = history.history['accuracy'][-1]
train_loss = history.history['loss'][-1]
print(f'Train Loss: {train_loss:.4f}, Train Accuracy: {train_acc * 100:.2f}%')

# Evaluate the model on the test data
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=1)
print(f'Test Loss: {test_loss:.4f}, Test Accuracy: {test_acc * 100:.2f}%')

# Plot training & validation accuracy values
import matplotlib.pyplot as plt

plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Test Accuracy')
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='upper left')

# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Test Loss')
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper left')

# Display the plots
plt.tight_layout()
plt.show()
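Finally, a hedged sketch of how the trained model could score a new, hand-written review. The sample_review string and the encode_review helper are illustrative additions, not part of the exercise: the +3 index offset and the use of index 2 for out-of-vocabulary words follow the Keras imdb.load_data convention, and the start-of-sequence token that load_data prepends is omitted for simplicity.

# Score a new review with the trained model (illustrative helper)
word_index = imdb.get_word_index()

def encode_review(text, num_words=max_features):
    ids = []
    for w in text.lower().split():
        idx = word_index.get(w, -1) + 3                  # shift past the reserved indices
        ids.append(idx if 3 <= idx < num_words else 2)   # 2 = unknown/out-of-vocabulary token
    return sequence.pad_sequences([ids], maxlen=max_len)

sample_review = "the movie was surprisingly good and the acting was wonderful"
prob = model.predict(encode_review(sample_review))[0][0]
print(f'Positive sentiment probability: {prob:.3f}')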
