Deep DL Manual Nainish

COMPUTER SCIENCE AND ENGINEERING

FACULTY OF ENGINEERING AND TECHNOLOGY


DL-NLP (203105477) B.Tech. 7th SEM

PRACTICAL-6

AIM: Use the Keras deep learning library and write a code for encoding with one_hot.

IMPLEMENTATION:

from tensorflow.keras.preprocessing.text import one_hot, text_to_word_sequence

# Our text document
text = "This text is used here just for demonstration purposes only; I am STUDENT from Parul University Vadodara."

# Document tokenization
words = text_to_word_sequence(text)

# One-hot encode each word in the text
vocabulary_size = len(set(words))
one_hot_encoded = [one_hot(word, vocabulary_size) for word in words]

print("\nOne-Hot Encoded Words:")
print(one_hot_encoded)
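
Note: Keras one_hot returns a hashed integer index for each word (so two words can collide), not a full binary one-hot vector. A minimal follow-up sketch, not part of the original practical, assuming the variables defined above, that expands those indices into binary vectors with to_categorical:

from tensorflow.keras.utils import to_categorical

# one_hot() returns indices in the range [1, vocabulary_size), so reserve one extra class
flat_indices = [idx for sub in one_hot_encoded for idx in sub]
one_hot_vectors = to_categorical(flat_indices, num_classes=vocabulary_size + 1)
print(one_hot_vectors.shape)  # (number of words, vocabulary_size + 1)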

OUTPUT:


PRACTICAL-7

AIM: Use the Keras deep learning library and write a code for hash encoding with hashing_trick.
IMPLEMENTATION:

from tensorflow.keras.preprocessing.text import hashing_trick

# Sample text data
documents = [
    "Keras is a deep learning library",
    "Hash encoding is useful for large vocabularies",
    "Using Keras and TensorFlow for deep learning tasks",
    "Hashing trick is a great feature in Keras"
]

# Parameters for the hashing trick
n_dimensions = 10  # Define the number of hash space dimensions

# Apply the hashing trick to each document
hash_encoded_docs = [hashing_trick(doc, n_dimensions, hash_function='md5')
                     for doc in documents]

# Print the hash-encoded results
for i, encoded_doc in enumerate(hash_encoded_docs):
    print(f"Document {i+1} - Hash Encoded: {encoded_doc}")

OUTPUT:


PRACTICAL-8

AIM: Use the Keras deep learning library and give a demo of the Tokenizer API.

IMPLEMENTATION:

from tensorflow.keras.preprocessing.text import Tokenizer

tokenizer = Tokenizer()

texts = [
    "This is random text I am STUDENT from Parul University Vadodara.",
    "And here comes the second one.",
    "Finally, this is the third sentence."
]

# Fit the tokenizer on the texts to build the word index
tokenizer.fit_on_texts(texts)

# Convert the list of texts to sequences of integers
sequences = tokenizer.texts_to_sequences(texts)

# Print the sequences
print(sequences)
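
The Tokenizer also exposes the learned vocabulary and can produce fixed-size document vectors. A minimal follow-up sketch, not part of the original practical, assuming the tokenizer and texts defined above:

# Inspect the word-to-index mapping built by fit_on_texts
print(tokenizer.word_index)

# Encode each document as a fixed-length bag-of-words vector (binary counts)
matrix = tokenizer.texts_to_matrix(texts, mode='binary')
print(matrix.shape)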

OUTPUT:


PRACTICAL-9
AIM: Experiment to do sentiment analysis on a dataset (such as Twitter tweets or movie reviews).

Sentiment Analysis Using a Movie Review Dataset with Keras and TensorFlow

In this experiment, we'll perform sentiment analysis using the IMDb Movie Reviews dataset provided by Keras. We'll build a basic deep learning model using an LSTM (Long Short-Term Memory) layer to classify movie reviews as positive or negative.

Steps Involved:
1. Load the Dataset: Use the IMDb dataset provided by Keras.
2. Preprocess the Data: Tokenize the text and pad sequences.
3. Build the Model: Use an LSTM layer to process the sequences.
4. Train the Model: Train the model using the preprocessed data.
5. Evaluate the Model: Evaluate performance on the test data.

Requirements:
Make sure you have TensorFlow installed: pip install tensorflow keras

Explanation:

1. Dataset:
   o We use the IMDb movie reviews dataset, which is a binary classification dataset with reviews labeled as either positive (1) or negative (0).
2. Preprocessing:
   o We limit the number of unique words to 10,000 (num_words=10000).
   o All reviews are padded or truncated to have exactly 100 words (pad_sequences).
3. Model:
   o The model has three layers:
      ▪ Embedding Layer: Converts input words into dense vectors of fixed size.
      ▪ LSTM Layer: Processes the sequence of word embeddings.
      ▪ Dense Layer: Outputs the classification result (positive/negative).
4. Training:
   o We train the model for 3 epochs using a batch size of 32.
5. Evaluation:
   o The model is evaluated on the test dataset, and the accuracy is printed.


How It Works:
• The dataset contains 50,000 movie reviews, split evenly into training and test sets.
• The LSTM network processes the sequence of word embeddings to predict whether a review is positive or negative.
• The model achieves around 83-85% accuracy after 3 epochs, which can be improved by fine-tuning parameters or using more sophisticated models.
Further Improvements:
• Use Bidirectional LSTM for better context understanding (a minimal sketch follows this list).
• Experiment with different neural network architectures (e.g., GRU, CNN).
• Apply techniques like Dropout and Regularization to avoid overfitting.
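
As an illustration of the first improvement, a minimal sketch (not the implementation below) assuming the same vocabulary size, embedding size, and sequence length as in this practical, wrapping the LSTM layer in a Bidirectional wrapper and adding Dropout:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Bidirectional, Dropout

bi_model = Sequential()
bi_model.add(Embedding(10000, 128, input_length=100))  # same vocabulary and sequence length as below
bi_model.add(Bidirectional(LSTM(64)))                  # reads each review in both directions
bi_model.add(Dropout(0.5))                             # regularization to reduce overfitting
bi_model.add(Dense(1, activation='sigmoid'))           # binary sentiment output
bi_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])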

IMPLEMENTATION:

Python Program

import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Step 1: Load the IMDb Movie Review Dataset
max_features = 10000  # Number of words to consider as features (vocabulary size)
maxlen = 100          # Cut off texts after this number of words (max sequence length)
embedding_dim = 128   # Embedding vector size

# Load data and split into training and test sets
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

# Step 2: Preprocess the Data (pad sequences to have equal length)
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)


# Step 3: Build the LSTM Model
model = Sequential()
model.add(Embedding(max_features, embedding_dim, input_length=maxlen))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))  # LSTM layer
model.add(Dense(1, activation='sigmoid'))  # Output layer for binary classification (positive/negative)

# Step 4: Compile the Model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Step 5: Train the Model
batch_size = 32
epochs = 3
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
          validation_data=(x_test, y_test))

# Step 6: Evaluate the Model
score, accuracy = model.evaluate(x_test, y_test, batch_size=batch_size)
print(f"Test accuracy: {accuracy * 100:.2f}%")

OUTPUT:


PRACTICAL-10
AIM: Perform an experiment using the Gensim Python library for Word2Vec embedding.

IMPLEMENTATION:

from gensim.models import Word2Vec
from nltk.tokenize import word_tokenize
import nltk

nltk.download('punkt')

data = [
    "Hello I am a student from Parul University Vadodara.",
    "Word embeddings are useful for natural language processing.",
    "Gensim is a popular library for word embeddings.",
]

# Tokenize and lowercase each sentence
tokenized_data = [word_tokenize(sentence.lower()) for sentence in data]

# Train a CBOW Word2Vec model (sg=0)
model = Word2Vec(sentences=tokenized_data, vector_size=100, window=5,
                 min_count=1, sg=0)

# Look up the embedding and the most similar words for 'word'
vector = model.wv['word']
similar_words = model.wv.most_similar('word', topn=5)

# Save the model and load it back
model.save('word2vec_model')
loaded_model = Word2Vec.load('word2vec_model')
loaded_vector = loaded_model.wv['word']
loaded_similar_words = loaded_model.wv.most_similar('word', topn=5)

print("Vector for 'word':\n", vector)
print("Similar words to 'word':\n", similar_words)
print("Vector for 'word' from loaded model:\n", loaded_vector)
print("Similar words to 'word' from loaded model:\n", loaded_similar_words)
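
Beyond nearest-neighbour lookups, the trained vectors can be compared directly. A minimal follow-up sketch, not part of the original practical, assuming the model trained above (both words appear in the tiny corpus, so the similarity value will not be very meaningful):

# Cosine similarity between two in-vocabulary words
print(model.wv.similarity('word', 'embeddings'))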

OUTPUT:
