Hugging Face
● Hugging Face's Transformers library is a powerful tool for working with Transformer-based models, which are widely used in natural language processing (NLP) tasks like text classification, sentiment analysis, translation, and more. Popular models like BERT (Bidirectional Encoder Representations from Transformers), GPT (Generative Pre-trained Transformer), and DistilBERT are based on the Transformer architecture. Here's an introduction to Transformers and how you can fine-tune pre-trained models such as DistilBERT for tasks like text classification and sentiment analysis on your custom dataset.
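As a quick illustration of the library (a minimal sketch using the default sentiment-analysis pipeline rather than a fine-tuned model; the example sentence is made up):
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default pre-trained sentiment model on first use
print(classifier("Hugging Face makes working with Transformers easy!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]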
● Hugging Face provides a variety of pre-trained models that you can directly use or fine-tune for specific NLP tasks. A common use case is text classification or sentiment analysis, where you want to assign a label to a given text.
● DistilBERT is one of the most popular pre-trained models for such tasks. It is a smaller and faster version of BERT, with 40% fewer parameters while retaining 97% of BERT's performance. It's a great choice when you need efficiency without sacrificing much accuracy.
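A rough way to see the size difference is to count parameters of the two base checkpoints; a brief sketch (the printed numbers are only approximate expectations):
from transformers import AutoModel

bert = AutoModel.from_pretrained("bert-base-uncased")
distilbert = AutoModel.from_pretrained("distilbert-base-uncased")

# bert-base has roughly 110M parameters, distilbert roughly 66M (about 40% fewer)
print(bert.num_parameters(), distilbert.num_parameters())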
from transformers import DistilBertForSequenceClassification
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)  # 2 labels for binary classification
c) Preprocess the dataset:
You can use the datasets library from Hugging Face to load and preprocess your custom dataset. You need to tokenize your text data using
the tokenizer.
from datasets import load_dataset
from transformers import DistilBertTokenizerFast

# Load the tokenizer that matches the DistilBERT checkpoint
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

def preprocess_function(examples):
    return tokenizer(examples['text'], padding="max_length", truncation=True)

# Example with a custom dataset; replace these files with your own (they need 'text' and 'label' columns)
dataset = load_dataset('csv', data_files={'train': 'train.csv', 'test': 'test.csv'})

# Apply the tokenizer to every split of the dataset
tokenized_datasets = dataset.map(preprocess_function, batched=True)
e) Initialize the Trainer:
The Trainer class simplifies the training and evaluation loop for most Transformer models.
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,                        # a TrainingArguments instance (defined in the full example below)
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
)
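With the Trainer in place, fine-tuning and evaluation are one call each. A brief sketch (trainer.train() also appears in the full example below):
trainer.train()               # fine-tune DistilBERT on the training split
metrics = trainer.evaluate()  # run evaluation on the test split
print(metrics)                # e.g. eval_loss, plus any metrics you configure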
Sentiment Analysis
The above setup can be used for sentiment analysis by assigning labels like 0 for negative and 1 for positive
sentiments in the dataset. After fine-tuning, the model can classify whether a given text has a positive or negative
sentiment.
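After fine-tuning, a prediction for a single text might look like the following sketch (the example sentence is made up, and label 1 is assumed to mean positive as described above):
import torch

model.eval()  # disable dropout for inference
text = "I really enjoyed this movie!"  # hypothetical example sentence
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_label = int(torch.argmax(logits, dim=-1))  # 0 = negative, 1 = positive
print(predicted_label)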
Fine-Tuning on a Custom Dataset
Fine-tuning a pre-trained model like DistilBERT is effective when you want to adapt the model to a specific task or
domain, such as legal text classification, medical document classification, etc. All you need is a labeled dataset with
the target labels (for example, positive/negative in sentiment analysis).
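For illustration, a small labeled dataset can also be built in memory instead of from CSV files; a sketch with hypothetical examples (the column names 'text' and 'label' match what the preprocessing code above expects):
from datasets import Dataset

data = {
    'text': ["The contract terms are unacceptable.", "The diagnosis was confirmed quickly."],  # hypothetical texts
    'label': [0, 1],                                                                           # 0 = negative, 1 = positive
}
custom_dataset = Dataset.from_dict(data).train_test_split(test_size=0.5)
tokenized = custom_dataset.map(preprocess_function, batched=True)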
Fine-tuning Pre-trained Models for Specific Tasks
Pre-trained models like DistilBERT are general-purpose, meaning they are trained on vast amounts of diverse data and can be
adapted (fine-tuned) for specific tasks with relatively few task-specific labeled examples.
Fine-tuning Process:
• Load the pre-trained model and tokenizer: The model and tokenizer are loaded from the Hugging Face Model Hub.
• Prepare your dataset: Your custom dataset needs to be tokenized, i.e. your text is converted into numerical input.
• Train the model: The model is trained (fine-tuned) on your dataset using Hugging Face's Trainer class, which abstracts away the complexity of model training.
• Evaluate the model: After training, the model is evaluated on a test set to check how well it generalizes to unseen data (see the metrics sketch after this list).
The process involves setting training arguments such as learning rate, batch size, number of epochs, and more.
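The evaluation step can report task metrics by passing a compute_metrics function to the Trainer; a minimal sketch, assuming accuracy is the metric of interest:
import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {'accuracy': (predictions == labels).mean()}

# Passed to the Trainer alongside the other arguments:
# trainer = Trainer(..., compute_metrics=compute_metrics)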
Example of fine-tuning DistilBERT for text classification:
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',              # output directory
    evaluation_strategy="epoch",         # evaluate after each epoch
    per_device_train_batch_size=16,      # batch size for training
    per_device_eval_batch_size=64,       # batch size for evaluation
    num_train_epochs=3,                  # number of training epochs
    weight_decay=0.01,                   # strength of weight decay
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
)

trainer.train()  # Fine-tune the model
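After training, the fine-tuned model and tokenizer can be saved and reloaded for later use; a brief sketch, with the directory name './fine-tuned-distilbert' chosen only for illustration:
trainer.save_model('./fine-tuned-distilbert')         # saves the model weights and config
tokenizer.save_pretrained('./fine-tuned-distilbert')  # saves the tokenizer files alongside

# Reload later for inference
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast
model = DistilBertForSequenceClassification.from_pretrained('./fine-tuned-distilbert')
tokenizer = DistilBertTokenizerFast.from_pretrained('./fine-tuned-distilbert')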