Project 1

pip install tensorflow-datasets

Requirement already satisfied: tensorflow-datasets in c:\users\bharg\anaconda3\lib\site-packages (4.9.4)
Requirement already satisfied: absl-py in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (1.4.0)
Requirement already satisfied: click in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (8.0.4)
Requirement already satisfied: dm-tree in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (0.1.8)
Requirement already satisfied: etils[enp,epath,etree]>=0.9.0 in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (1.8.0)
Requirement already satisfied: numpy in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (1.24.3)
Requirement already satisfied: promise in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (2.3)
Requirement already satisfied: protobuf>=3.20 in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (3.20.3)
Requirement already satisfied: psutil in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (5.9.0)
Requirement already satisfied: requests>=2.19.0 in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (2.31.0)
Requirement already satisfied: tensorflow-metadata in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (1.14.0)
Requirement already satisfied: termcolor in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (2.4.0)
Requirement already satisfied: toml in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (0.10.2)
Requirement already satisfied: tqdm in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (4.65.0)
Requirement already satisfied: wrapt in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-datasets) (1.14.1)
Requirement already satisfied: fsspec in c:\users\bharg\anaconda3\lib\site-packages (from etils[enp,epath,etree]>=0.9.0->tensorflow-datasets) (2023.4.0)
Requirement already satisfied: importlib_resources in c:\users\bharg\anaconda3\lib\site-packages (from etils[enp,epath,etree]>=0.9.0->tensorflow-datasets) (6.4.0)
Requirement already satisfied: typing_extensions in c:\users\bharg\anaconda3\lib\site-packages (from etils[enp,epath,etree]>=0.9.0->tensorflow-datasets) (4.7.1)
Requirement already satisfied: zipp in c:\users\bharg\anaconda3\lib\site-packages (from etils[enp,epath,etree]>=0.9.0->tensorflow-datasets) (3.11.0)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\bharg\anaconda3\lib\site-packages (from requests>=2.19.0->tensorflow-datasets) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in c:\users\bharg\anaconda3\lib\site-packages (from requests>=2.19.0->tensorflow-datasets) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\bharg\anaconda3\lib\site-packages (from requests>=2.19.0->tensorflow-datasets) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\bharg\anaconda3\lib\site-packages (from requests>=2.19.0->tensorflow-datasets) (2023.7.22)
Requirement already satisfied: colorama in c:\users\bharg\anaconda3\lib\site-packages (from click->tensorflow-datasets) (0.4.6)
Requirement already satisfied: six in c:\users\bharg\anaconda3\lib\site-packages (from promise->tensorflow-datasets) (1.16.0)
Requirement already satisfied: googleapis-common-protos<2,>=1.52.0 in c:\users\bharg\anaconda3\lib\site-packages (from tensorflow-metadata->tensorflow-datasets) (1.63.0)
Note: you may need to restart the kernel to use updated packages.
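
With the package installed, an optional sanity check is to confirm the CIFAR-100 builder is registered and inspect its metadata before training. This sketch only uses the dataset's builder info and does not download any data.

import tensorflow_datasets as tfds

# Optional check: look up the CIFAR-100 builder and print its metadata
builder = tfds.builder('cifar100')
print(builder.info.features)                        # image (32, 32, 3) plus fine and coarse labels
print(builder.info.splits['train'].num_examples)    # 50000 training examples
print(builder.info.splits['test'].num_examples)     # 10000 test examples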

import tensorflow as tf
import tensorflow_datasets as tfds

# Load the CIFAR-100 dataset
(train_ds, test_ds), info = tfds.load('cifar100', split=['train', 'test'], with_info=True)

# Preprocess the data
def preprocess_data(sample):
    image = tf.image.resize(sample['image'], (32, 32)) / 255.0   # Resize and normalize to [0, 1]
    label = tf.one_hot(sample['label'], depth=100)                # One-hot encode the labels
    return image, label

train_ds = train_ds.map(preprocess_data).shuffle(1000).batch(64)
test_ds = test_ds.map(preprocess_data).batch(64)

# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(100, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_ds, epochs=10, validation_data=test_ds)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_ds)
print(f'Test accuracy: {test_acc}')

Epoch 1/10

C:\Users\bharg\anaconda3\Lib\site-packages\keras\src\layers\convolutional\base_conv.py:99: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(

782/782 ━━━━━━━━━━━━━━━━━━━━ 17s 18ms/step - accuracy: 0.0484 - loss: 4.2915 - val_accuracy: 0.1532 - val_loss: 3.5971
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.1759 - loss: 3.4663 - val_accuracy: 0.2183 - val_loss: 3.2293
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.2337 - loss: 3.1450 - val_accuracy: 0.2553 - val_loss: 3.0379
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 14s 17ms/step - accuracy: 0.2725 - loss: 2.9479 - val_accuracy: 0.2803 - val_loss: 2.9086
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.3032 - loss: 2.7833 - val_accuracy: 0.2920 - val_loss: 2.8498
Epoch 6/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.3274 - loss: 2.6570 - val_accuracy: 0.3246 - val_loss: 2.6955
Epoch 7/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.3482 - loss: 2.5506 - val_accuracy: 0.3382 - val_loss: 2.6332
Epoch 8/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.3655 - loss: 2.4663 - val_accuracy: 0.3504 - val_loss: 2.5762
Epoch 9/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.3859 - loss: 2.3841 - val_accuracy: 0.3444 - val_loss: 2.5976
Epoch 10/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 13s 17ms/step - accuracy: 0.3957 - loss: 2.3275 - val_accuracy: 0.3557 - val_loss: 2.5423
157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.3511 - loss: 2.5547
Test accuracy: 0.35569998621940613
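
After evaluation, the fitted model can be used directly for inference. The following is a minimal sketch, assuming the `model`, `test_ds`, and `info` objects from the script above are still in scope; the human-readable class names are taken from the dataset's `info` metadata.

import numpy as np

# Predict labels for one batch of preprocessed test images (assumes `model`, `test_ds`, `info` from above)
images, labels = next(iter(test_ds))            # one batch of (image, one-hot label) pairs
probs = model.predict(images)                   # softmax probabilities, shape (batch_size, 100)
pred_ids = np.argmax(probs, axis=1)             # predicted class indices
true_ids = np.argmax(labels.numpy(), axis=1)    # recover class indices from the one-hot labels

class_names = info.features['label'].names      # CIFAR-100 fine label names
for p, t in list(zip(pred_ids, true_ids))[:5]:
    print(f"predicted: {class_names[p]}   actual: {class_names[t]}")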
Project 2
import pandas as pd
import numpy as np
import re
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load the dataset
data = pd.read_csv("C:\\Users\\bharg\\Downloads\\archive (2)\\training.1600000.processed.noemoticon.csv",
                   encoding="ISO-8859-1", header=None,
                   names=["target", "ids", "date", "flag", "user", "text"])

# Drop unnecessary columns
data.drop(["ids", "date", "flag", "user"], axis=1, inplace=True)

# Clean the text data
def clean_text(text):
    # Remove URLs, mentions, and hashtags
    text = re.sub(r"http\S+|@\S+|#\S+", "", text)
    # Remove special characters and punctuation
    text = re.sub(r"[^a-zA-Z0-9]", " ", text)
    # Convert to lowercase
    text = text.lower()
    # Remove extra whitespace
    text = re.sub(r"\s+", " ", text).strip()
    return text

data["text"] = data["text"].apply(clean_text)

# Encode target labels (Sentiment140 uses 0 = negative, 2 = neutral, 4 = positive)
data["target"] = data["target"].replace({0: 0, 2: 1, 4: 2})

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data["text"], data["target"],
                                                    test_size=0.2, random_state=42)

# Feature extraction using TF-IDF
tfidf_vectorizer = TfidfVectorizer(max_features=5000)
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)
X_test_tfidf = tfidf_vectorizer.transform(X_test)

# Train a Logistic Regression model
logistic_reg = LogisticRegression(max_iter=1000)
logistic_reg.fit(X_train_tfidf, y_train)

# Evaluate the model
y_pred = logistic_reg.predict(X_test_tfidf)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print(classification_report(y_test, y_pred))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))

Accuracy: 0.79061875
              precision    recall  f1-score   support

           0       0.80      0.78      0.79    159494
           2       0.78      0.80      0.79    160506

    accuracy                           0.79    320000
   macro avg       0.79      0.79      0.79    320000
weighted avg       0.79      0.79      0.79    320000

Confusion Matrix:
[[124040  35454]
 [ 31548 128958]]
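
The fitted vectorizer and classifier can also be reused on unseen text. Below is a minimal sketch, assuming `clean_text`, `tfidf_vectorizer`, and `logistic_reg` from the script above are still in scope; the sample tweets are made-up examples, and the predicted labels follow the encoding used above (0 for negative, 2 for positive after mapping 4 to 2).

# Classify new tweets with the trained model (assumes `clean_text`, `tfidf_vectorizer`, `logistic_reg` from above)
new_tweets = ["I love this phone, the battery lasts forever!",
              "worst customer service I have ever experienced"]
cleaned = [clean_text(t) for t in new_tweets]        # apply the same cleaning as the training data
features = tfidf_vectorizer.transform(cleaned)       # reuse the fitted TF-IDF vocabulary
predictions = logistic_reg.predict(features)
for tweet, label in zip(new_tweets, predictions):
    print(label, tweet)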

Project 3
import pandas as pd
import numpy as np
import re
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

# If the NLTK resources are missing, download them once:
# import nltk; nltk.download('punkt'); nltk.download('stopwords')

# Load the dataset (read as CSV; the file is expected to provide "Id", "Review", and "Rating" columns)
data = pd.read_csv("C:\\Users\\bharg\\OneDrive\\Desktop\\train.ft.txt")

# Drop unnecessary columns
data.drop(["Id"], axis=1, inplace=True)

# Clean the text data
def clean_text(text):
    # Remove special characters and punctuation
    text = re.sub(r"[^a-zA-Z0-9]", " ", text)
    # Convert to lowercase
    text = text.lower()
    # Tokenize the text
    tokens = word_tokenize(text)
    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    filtered_tokens = [word for word in tokens if word not in stop_words]
    # Join tokens back into text
    cleaned = " ".join(filtered_tokens)
    return cleaned

data["Review"] = data["Review"].apply(clean_text)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data["Review"], data["Rating"],
                                                    test_size=0.2, random_state=42)

# Create a pipeline for text classification
text_clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultinomialNB()),
])

# Train the model
text_clf.fit(X_train, y_train)

# Evaluate the model
y_pred = text_clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print(classification_report(y_test, y_pred))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
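
Because the vectorizer, TF-IDF weighting, and classifier are wrapped in a single pipeline, predictions on new reviews need only the cleaned text. A minimal usage sketch, assuming the fitted `text_clf` pipeline and `clean_text` function from above; the sample reviews are made-up examples, and the outputs are values from the "Rating" column the pipeline was trained on.

# Predict ratings for new reviews with the fitted pipeline (assumes `text_clf` and `clean_text` from above)
new_reviews = ["Great product, exactly as described and arrived early.",
               "Stopped working after two days, very disappointed."]
cleaned_reviews = [clean_text(r) for r in new_reviews]   # same preprocessing as the training data
print(text_clf.predict(cleaned_reviews))                 # one predicted rating per review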
