BERT.ipynb - Colaboratory

This document summarizes the steps taken to build and train a BERT model for text classification on an SMS spam detection dataset. The key steps are: 1. Preprocessing the SMS spam dataset and splitting it into train, validation, and test sets. 2. Loading the pre-trained BERT model and defining a classification architecture with additional layers. 3. Tokenizing and encoding the train, validation, and test texts with the BERT tokenizer. 4. Defining the model, loss function, optimizer, and training loop to fine-tune BERT on the SMS spam detection task for 20 epochs.


Subrata Jana, Week 10 Assignment: BERT Model

!pip install transformers


import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import transformers
from transformers import AutoModel, BertTokenizerFast

# specify GPU
#device = torch.device("cuda")

from google.colab import files

uploaded = files.upload()

Saving spam.csv to spam.csv

df = pd.read_csv("spam.csv", encoding = 'latin-1')


df.head()

     v1                                                  v2  Unnamed: 2  Unnamed: 3  Unnamed: 4
0   ham  Go until jurong point, crazy.. Available only ...         NaN         NaN         NaN
1   ham                      Ok lar... Joking wif u oni...          NaN         NaN         NaN
2  spam  Free entry in 2 a wkly comp to win FA Cup fina...         NaN         NaN         NaN
3   ham  U dun say so early hor... U c already then say...         NaN         NaN         NaN

# drop the mostly-empty 'Unnamed' columns (any column that contains NaN values)
df.dropna(how="any", inplace=True, axis=1)


df.columns = ['label', 'message']
df.head()

label message

0 ham Go until jurong point, crazy.. Available only ...

1 ham Ok lar... Joking wif u oni...

2 spam Free entry in 2 a wkly comp to win FA Cup fina...

3 ham U dun say so early hor... U c already then say...

4 ham Nah I don't think he goes to usf, he lives aro...

# encode the labels as integers: ham -> 0, spam -> 1
df['label_num'] = df.label.map({'ham':0, 'spam':1})
df.head()

label message label_num

0 ham Go until jurong point, crazy.. Available only ... 0

1 ham Ok lar... Joking wif u oni... 0

2 spam Free entry in 2 a wkly comp to win FA Cup fina... 1

3 ham U dun say so early hor... U c already then say... 0

4 ham Nah I don't think he goes to usf, he lives aro... 0

# check class distribution


df['label_num'].value_counts(normalize = True)

0 0.865937
1 0.134063
Name: label_num, dtype: float64

# split the dataset into train, validation and test sets (roughly 70% / 15% / 15%)
train_text, temp_text, train_labels, temp_labels = train_test_split(df['message'], df['label_num'],
                                                                    random_state=2018,
                                                                    test_size=0.3,
                                                                    stratify=df['label_num'])

val_text, test_text, val_labels, test_labels = train_test_split(temp_text, temp_labels,
                                                                random_state=2018,
                                                                test_size=0.5,
                                                                stratify=temp_labels)
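As a quick sanity check (added here, not part of the original notebook), the split sizes can be printed; test_size=0.3 followed by a 50/50 split of the remainder yields the 70/15/15 proportions mentioned above.

# sanity check (added): number of messages in each split
print(len(train_text), len(val_text), len(test_text))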

# import BERT-base pretrained model


bert = AutoModel.from_pretrained('bert-base-uncased')

# Load the BERT tokenizer


tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')

config.json: 100% 570/570 [00:00<00:00, 12.4kB/s]
model.safetensors: 100% 440M/440M [00:03<00:00, 113MB/s]
tokenizer_config.json: 100% 28.0/28.0 [00:00<00:00, 585B/s]
vocab.txt: 100% 232k/232k [00:00<00:00, 3.66MB/s]

# get length of all the messages in the train set


seq_len = [len(i.split()) for i in train_text]

pd.Series(seq_len).hist(bins = 30)

<Axes: >
[Output: histogram of per-message word counts in the training set]
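A supplementary look at the same distribution (added here, not in the original notebook) helps motivate the max_length of 25 used when padding and truncating in the next cells.

# supplementary check: summary statistics and upper quantiles of message length (in words)
print(pd.Series(seq_len).describe())
print(pd.Series(seq_len).quantile([0.90, 0.95, 0.99]))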

# tokenize and encode sequences in the training set
tokens_train = tokenizer.batch_encode_plus(
train_text.tolist(),
max_length = 25,
pad_to_max_length=True,
truncation=True
)

# tokenize and encode sequences in the validation set


tokens_val = tokenizer.batch_encode_plus(
val_text.tolist(),
max_length = 25,
pad_to_max_length=True,
truncation=True
)

# tokenize and encode sequences in the test set


tokens_test = tokenizer.batch_encode_plus(
test_text.tolist(),
max_length = 25,
pad_to_max_length=True,
truncation=True
)
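To see what batch_encode_plus produces, a single encoded example can be inspected; this is an illustrative check added here, not part of the original notebook. Each entry holds input_ids padded or truncated to 25 tokens and a matching attention_mask of 1s (real tokens) and 0s (padding).

# illustrative check (added): inspect one encoded training example
print(train_text.tolist()[0])
print(tokens_train['input_ids'][0])       # 25 token ids, including [CLS], [SEP] and [PAD]
print(tokens_train['attention_mask'][0])  # 1 for real tokens, 0 for padding
print(tokenizer.decode(tokens_train['input_ids'][0]))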

train_seq = torch.tensor(tokens_train['input_ids'])
train_mask = torch.tensor(tokens_train['attention_mask'])
train_y = torch.tensor(train_labels.tolist())

val_seq = torch.tensor(tokens_val['input_ids'])
val_mask = torch.tensor(tokens_val['attention_mask'])
val_y = torch.tensor(val_labels.tolist())

test_seq = torch.tensor(tokens_test['input_ids'])
test_mask = torch.tensor(tokens_test['attention_mask'])
test_y = torch.tensor(test_labels.tolist())

from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler

#define a batch size


batch_size = 32

# wrap tensors
train_data = TensorDataset(train_seq, train_mask, train_y)

# sampler for sampling the data during training


train_sampler = RandomSampler(train_data)

# dataLoader for train set


train_dataloader = DataLoader(train_data, sampler=train_sampler, batch_size=batch_size)

# wrap tensors
val_data = TensorDataset(val_seq, val_mask, val_y)

# sampler for sampling the data during validation
val_sampler = SequentialSampler(val_data)

# dataLoader for validation set


val_dataloader = DataLoader(val_data, sampler = val_sampler, batch_size=batch_size)

# freeze all the BERT parameters so only the new classification head is trained
for param in bert.parameters():
    param.requires_grad = False
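To confirm the freeze (a check added here, not in the source notebook), the frozen parameter count can be printed; every BERT weight should now be excluded from gradient updates.

# added check: count frozen vs. trainable parameters inside the BERT backbone
frozen = sum(p.numel() for p in bert.parameters() if not p.requires_grad)
trainable = sum(p.numel() for p in bert.parameters() if p.requires_grad)
print(f'frozen BERT parameters: {frozen:,}, trainable BERT parameters: {trainable:,}')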

class BERT_Arch(nn.Module):

    def __init__(self, bert):
        super(BERT_Arch, self).__init__()
        self.bert = bert

        # dropout layer
        self.dropout = nn.Dropout(0.1)

        # relu activation function
        self.relu = nn.ReLU()

        # dense layer 1
        self.fc1 = nn.Linear(768, 512)

        # dense layer 2 (output layer)
        self.fc2 = nn.Linear(512, 2)

        # softmax activation function
        self.softmax = nn.LogSoftmax(dim=1)

    # define the forward pass
    def forward(self, sent_id, mask):

        # pass the inputs to BERT; cls_hs is the pooled [CLS] representation
        _, cls_hs = self.bert(sent_id, attention_mask=mask, return_dict=False)

        x = self.fc1(cls_hs)
        x = self.relu(x)
        x = self.dropout(x)

        # output layer
        x = self.fc2(x)

        # apply softmax activation
        x = self.softmax(x)

        return x

# pass the pre-trained BERT to our defined architecture
model = BERT_Arch(bert)
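A small forward-pass check (added here as an illustration, not in the original notebook) confirms that the architecture returns one row of two log-probabilities per input message.

# illustration (added): run two training examples through the untrained head
with torch.no_grad():
    sample_out = model(train_seq[:2], train_mask[:2])
print(sample_out.shape)  # torch.Size([2, 2]) -- log-probabilities for ham/spam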

# optimizer from hugging face transformers
# (note: transformers.AdamW is deprecated in newer versions; torch.optim.AdamW is the drop-in replacement)
from transformers import AdamW

# define the optimizer
optimizer = AdamW(model.parameters(), lr=1e-5)

from sklearn.utils.class_weight import compute_class_weight

y = train_labels
classes=np.unique(y)

#compute the class weights


class_weights = compute_class_weight('balanced', classes = classes, y = y)

print('Class Weights:',class_weights)

/usr/local/lib/python3.10/dist-packages/transformers/optimization.py:411: FutureWarning: This implementation of AdamW is deprecated ...
  warnings.warn(
Class Weights: [0.57743559 3.72848948]
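These values follow from sklearn's 'balanced' heuristic, weight_c = n_samples / (n_classes * n_samples_c), i.e. 1 / (2 * class_fraction) here: 1/(2*0.866) is about 0.577 for ham and 1/(2*0.134) is about 3.73 for spam. The one-line check below (added for illustration) reproduces the printed weights.

# added check: 'balanced' class weights = n_samples / (n_classes * count per class)
print(len(y) / (2 * np.bincount(y)))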

# converting list of class weights to a tensor
weights= torch.tensor(class_weights,dtype=torch.float)

# push to GPU
#weights = weights.to(device)

# define the loss function


cross_entropy = nn.NLLLoss(weight=weights)
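As a side note, pairing the model's nn.LogSoftmax output with nn.NLLLoss is mathematically equivalent to applying nn.CrossEntropyLoss(weight=weights) to raw logits; the small check below (illustrative, not from the notebook) confirms this on random data.

# illustrative check (added): weighted CrossEntropyLoss on logits == weighted NLLLoss on log-probabilities
logits = torch.randn(4, 2)
demo_labels = torch.tensor([0, 1, 1, 0])
ce = nn.CrossEntropyLoss(weight=weights)(logits, demo_labels)
nll = nn.NLLLoss(weight=weights)(nn.LogSoftmax(dim=1)(logits), demo_labels)
print(torch.allclose(ce, nll))  # True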

# number of training epochs


epochs = 20

# function to train the model
def train():

    model.train()
    total_loss, total_accuracy = 0, 0

    # empty list to save model predictions
    total_preds = []

    # iterate over batches
    for step, batch in enumerate(train_dataloader):

        # progress update after every 50 batches
        if step % 50 == 0 and not step == 0:
            print('  Batch {:>5,}  of  {:>5,}.'.format(step, len(train_dataloader)))

        # push the batch to gpu
        #batch = [r.to(device) for r in batch]

        sent_id, mask, labels = batch

        # clear previously calculated gradients
        model.zero_grad()

        # get model predictions for the current batch
        preds = model(sent_id, mask)

        # compute the loss between actual and predicted values
        loss = cross_entropy(preds, labels)

        # add on to the total loss
        total_loss = total_loss + loss.item()

        # backward pass to calculate the gradients
        loss.backward()

        # clip the gradients to 1.0; this helps prevent the exploding gradient problem
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)

        # update parameters
        optimizer.step()

        # model predictions are stored on GPU, so push them to CPU
        preds = preds.detach().cpu().numpy()

        # append the model predictions
        total_preds.append(preds)

    # compute the training loss of the epoch
    avg_loss = total_loss / len(train_dataloader)

    # predictions are in the form of (no. of batches, size of batch, no. of classes);
    # reshape the predictions into (number of samples, no. of classes)
    total_preds = np.concatenate(total_preds, axis=0)

    # return the loss and predictions
    return avg_loss, total_preds

# function for evaluating the model
def evaluate():

    print("\nEvaluating...")

    # deactivate dropout layers
    model.eval()

    total_loss, total_accuracy = 0, 0

    # empty list to save the model predictions
    total_preds = []

    # iterate over batches
    for step, batch in enumerate(val_dataloader):

        # progress update every 50 batches
        if step % 50 == 0 and not step == 0:

            # calculate elapsed time in minutes
            # (format_time and t0 are not defined in this notebook, so the call is left commented out)
            #elapsed = format_time(time.time() - t0)

            # report progress
            print('  Batch {:>5,}  of  {:>5,}.'.format(step, len(val_dataloader)))

        # push the batch to gpu
        #batch = [t.to(device) for t in batch]

        sent_id, mask, labels = batch

        # deactivate autograd
        with torch.no_grad():

            # model predictions
            preds = model(sent_id, mask)

            # compute the validation loss between actual and predicted values
            loss = cross_entropy(preds, labels)

            total_loss = total_loss + loss.item()

            preds = preds.detach().cpu().numpy()

            total_preds.append(preds)

    # compute the validation loss of the epoch
    avg_loss = total_loss / len(val_dataloader)

    # reshape the predictions into (number of samples, no. of classes)
    total_preds = np.concatenate(total_preds, axis=0)

    return avg_loss, total_preds

# set initial loss to infinite
best_valid_loss = float('inf')

# defining epochs
epochs = 20

# empty lists to store training and validation loss of each epoch
train_losses = []
valid_losses = []

# for each epoch
for epoch in range(epochs):

    print('\n Epoch {:} / {:}'.format(epoch + 1, epochs))

    # train model
    train_loss, _ = train()

    # evaluate model
    valid_loss, _ = evaluate()

    # save the best model
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'saved_weights.pt')

    # append training and validation loss
    train_losses.append(train_loss)
    valid_losses.append(valid_loss)

    print(f'\nTraining Loss: {train_loss:.3f}')
    print(f'Validation Loss: {valid_loss:.3f}')

#load weights of best model


path = 'saved_weights.pt'
model.load_state_dict(torch.load(path))
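The captured transcript ends after reloading the best weights. Below is a hedged sketch (not part of the original notebook output) of how the held-out test set could be scored with the classification_report imported at the top.

# sketch (added): get predictions for the test set and report precision/recall/F1
model.eval()
with torch.no_grad():
    test_preds = model(test_seq, test_mask)
    test_preds = test_preds.detach().cpu().numpy()

pred_labels = np.argmax(test_preds, axis=1)
print(classification_report(test_y.numpy(), pred_labels))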
