
BERT explained from scratch
Umar Jamil
Downloaded from: https://github.com/hkproj/bert-from-scratch
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0):
https://creativecommons.org/licenses/by-nc/4.0/legalcode

Not for commercial use



Outline

Prerequisites
• Structure of the Transformer model and how the attention mechanism works.

• Language Models
  • Training
  • Inference
• Transformer architecture (Encoder)
  • Embedding vectors
  • Positional encoding
  • Self attention and causal mask
• BERT
  • The importance of the left and the right context
  • BERT pre-training
    • Masked Language Model task
    • Next Sentence Prediction task
  • BERT fine-tuning
    • Text Classification Task
    • Question Answering Task



What is a language model?
A language model is a probabilistic model that assigns probabilities to sequences of words.
In practice, a language model allows us to compute the following:

P[ “China” | “Shanghai is a city in” ]
   next token   prompt

We usually train a neural network to predict these probabilities. A neural network trained on a
large corpus of text is known as a Large Language Model (LLM).
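As a rough illustration (not part of the original slides), the PyTorch-style sketch below shows how such a probability can be read off a trained network: the model produces a score (logit) for every token of the vocabulary, and a softmax turns those scores into a probability distribution. The `model` here is a hypothetical network.

    import torch

    def next_token_probabilities(model, prompt_ids: torch.Tensor) -> torch.Tensor:
        # prompt_ids: (1, seq_len) token IDs of the prompt, e.g. "Shanghai is a city in"
        with torch.no_grad():
            logits = model(prompt_ids)              # (1, seq_len, vocab_size)
        last_logits = logits[0, -1]                 # scores for the token that follows the prompt
        return torch.softmax(last_logits, dim=-1)   # probability of every candidate next token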



How to train a language model?
Imagine we want to train a language model on Chinese poems, for example the following one:

English                                         Chinese (simplified)
Before my bed lies a pool of moon bright        床前明月光
I could imagine that it's frost on the ground   疑是地上霜
I look up and see the bright shining moon       举头望明月
Bowing my head I am thinking of home            低头思故乡
(Li Bai, 李白)



How to train a language model?

Input sequence (10 tokens):  [SOS] Before my bed lies a pool of moon bright
        ↓ Neural Network (Transformer Encoder)
Output sequence (10 tokens): TK1 TK2 TK3 TK4 TK5 TK6 TK7 TK8 TK9 TK10
Target sequence (10 tokens): Before my bed lies a pool of moon bright [EOS]

Cross-entropy loss between the output and the target → run backpropagation to update the weights.
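A minimal sketch (an assumption, not the original code) of what this training step could look like in PyTorch, with a hypothetical `model` that returns one logit per vocabulary token at every position; the input is the sequence shifted right with [SOS], the target is the sequence followed by [EOS].

    import torch
    import torch.nn as nn

    def training_step(model, optimizer, token_ids: torch.Tensor) -> float:
        # token_ids: (batch, seq_len), e.g. [SOS] Before my bed ... bright [EOS]
        inputs  = token_ids[:, :-1]               # [SOS] Before ... bright
        targets = token_ids[:, 1:]                # Before ... bright [EOS]
        logits = model(inputs)                    # (batch, seq_len-1, vocab_size)
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)),  # one prediction per position
            targets.reshape(-1),
        )
        optimizer.zero_grad()
        loss.backward()                           # run backpropagation
        optimizer.step()                          # update the weights
        return loss.item()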



How to run inference on a language model?
Imagine you’re a (lazy) student who had to memorize Li Bai’s poem but only remembers the first two
words. How do you survive the exam? Use the two words you remember, “Before my”, as the prompt and
ask the language model to write the rest of the poem!

English                                         Chinese (simplified)
Before my bed lies a pool of moon bright        床前明月光
I could imagine that it's frost on the ground   疑是地上霜
I look up and see the bright shining moon       举头望明月
Bowing my head I am thinking of home            低头思故乡


How to run inference on a language model?
At every step the input sequence is fed to the Neural Network (Transformer Encoder), the model predicts
the next token, and we append that token to the input for the following step:

Step 1: Input: [SOS] Before my                                 → Output: Before my bed
Step 2: Input: [SOS] Before my bed                             → Output: Before my bed lies
Step 3: Input: [SOS] Before my bed lies                        → Output: Before my bed lies a
Step 4: Input: [SOS] Before my bed lies a                      → Output: Before my bed lies a pool
Step 5: Input: [SOS] Before my bed lies a pool                 → Output: Before my bed lies a pool of
Step 6: Input: [SOS] Before my bed lies a pool of              → Output: Before my bed lies a pool of moon
Step 7: Input: [SOS] Before my bed lies a pool of moon         → Output: Before my bed lies a pool of moon bright
Step 8: Input: [SOS] Before my bed lies a pool of moon bright  → Output: Before my bed lies a pool of moon bright [EOS]

Generation stops when the model outputs the [EOS] token.
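A minimal sketch (assumption, PyTorch-style) of this greedy decoding loop, with a hypothetical `model` that returns logits of shape (1, seq_len, vocab_size):

    import torch

    def generate(model, input_ids: torch.Tensor, eos_id: int, max_new_tokens: int = 50) -> torch.Tensor:
        for _ in range(max_new_tokens):
            with torch.no_grad():
                logits = model(input_ids)                                # (1, seq_len, vocab_size)
            next_id = logits[0, -1].argmax()                             # most probable next token
            input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)  # append the last token to the input
            if next_id.item() == eos_id:                                 # stop at [EOS]
                break
        return input_ids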



Transformer Encoder architecture

[Diagram: the full Transformer architecture next to the Transformer Encoder, the part of the model used here.]



Let’s convert the input into Input Embeddings!

Original sentence (tokens):             [SOS]  Before  my   bed  lies  a    pool  of    moon  bright  pool*
Input IDs (position in the vocabulary): 1      90      231  413  559  952  421   7540  62    864     421
Embedding (vector of size 512):         one 512-dimensional vector per token

*The trailing “pool” is just an example (it is not in the original poem): identical tokens get the same input
ID (421) and therefore start from the same embedding vector.

We define d_model = 512, which represents the size of the embedding vector of each word.
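As a small illustration (assumed PyTorch, not part of the slides), an embedding layer is just a learned lookup table from input IDs to vectors of size d_model; the vocabulary size below is only an example.

    import torch
    import torch.nn as nn

    d_model, vocab_size = 512, 30000
    embedding = nn.Embedding(vocab_size, d_model)    # learned lookup table

    # [SOS] Before my bed lies a pool of moon bright pool  -> input IDs from the slide
    input_ids = torch.tensor([[1, 90, 231, 413, 559, 952, 421, 7540, 62, 864, 421]])
    vectors = embedding(input_ids)                   # shape: (1, 11, 512)

    # Identical tokens ("pool" twice, ID 421) get identical embedding vectors:
    assert torch.equal(vectors[0, 6], vectors[0, 10])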



Why do we use vectors to represent words?
Given the words “cherry”, “digital” and “information”, if we represent the embedding vectors
using only 2 dimensions (X, Y) and plot them, we hope to see something like the following: the angle
between words with similar meaning is small, while the angle between words with different
meanings is large. In other words, the embeddings “capture” the meaning of the words they represent by
projecting them into a high-dimensional space of size d_model.

Source: Speech and Language Processing 3rd Edition Draft, Dan Jurafsky and James H. Martin

We commonly use the cosine similarity, which is based on the dot product between the two
vectors.
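A tiny sketch (assumption) of the cosine similarity, i.e. the dot product of the two vectors divided by the product of their norms; the 2-D toy vectors below are made up purely for illustration.

    import torch

    def cosine_similarity(a: torch.Tensor, b: torch.Tensor) -> float:
        return (torch.dot(a, b) / (a.norm() * b.norm())).item()

    # Vectors pointing in similar directions (small angle) score close to 1.
    cherry      = torch.tensor([0.90, 0.10])
    digital     = torch.tensor([0.10, 0.95])
    information = torch.tensor([0.15, 0.90])
    print(cosine_similarity(digital, information))  # high (similar meaning)
    print(cosine_similarity(cherry, digital))       # low  (different meaning)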



Let’s add Positional Encodings!

Original sentence (tokens): [SOS] Before my bed lies a pool of moon bright

Each token is converted into its position in the vocabulary (input_id), then each input_id is transformed
into an embedding vector of size 512. To each token embedding we add a position embedding: a vector of
size 512 that indicates the position of the token in the sentence (positional encoding). The positional
encodings are only computed once and reused for every sentence during training and inference.

  Embedding (vector of size 512):          one vector per token
+ Position Embedding (vector of size 512): POS(token position, dimension)
= Encoder Input (vector of size 512):      one vector per token



How to compute positional encodings?

PE(pos, 2i)   = sin( pos / 10000^(2i / d_model) )
PE(pos, 2i+1) = cos( pos / 10000^(2i / d_model) )

Sentence 1: BEFORE MY BED → positions 0, 1, 2 use PE(0, 0)…PE(0, 511), PE(1, 0)…PE(1, 511), PE(2, 0)…PE(2, 511)
Sentence 2: I LOVE YOU    → positions 0, 1, 2 reuse exactly the same values

We only need to compute the positional encodings once and then reuse them for every sentence, whether
during training or inference.
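A minimal sketch (assumption, PyTorch) of how the sinusoidal positional-encoding matrix can be precomputed once and then reused for every sentence:

    import torch

    def positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
        pos = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)   # (max_len, 1)
        i = torch.arange(0, d_model, 2, dtype=torch.float32)            # even dimensions 2i
        angle = pos / (10000 ** (i / d_model))                          # (max_len, d_model/2)
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(angle)   # PE(pos, 2i)
        pe[:, 1::2] = torch.cos(angle)   # PE(pos, 2i+1)
        return pe

    pe = positional_encoding(max_len=512, d_model=512)
    # encoder_input = token_embeddings + pe[:seq_len]   (added position by position)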



The self-attention mechanism: input

Original sentence (tokens): [SOS] Before my bed lies a pool of moon bright

Each token is converted into its position in the vocabulary (input_id), then each input_id is transformed
into an embedding vector of size 512, to which we add its position vector (positional encoding).

The result is the Encoder Input: a matrix of shape (10, 512) where each row represents a token of the
input sequence.


The self-attention mechanism: Q, K and V

In a Large Language Model (LLM) we employ the self-attention mechanism, which means that the Query
(Q), the Key (K) and the Value (V) are the same matrix: each of them is the (10, 512) Encoder Input, one
row per token.



The self-attention mechanism

Self-attention allows the model to relate words to each other. In our case d_k = d_model = 512.

Attention(Q, K, V) = softmax( QK^T / sqrt(d_k) ) V

Q is (10, 512) and K^T is (512, 10), so softmax( QK^T / sqrt(d_k) ) is a (10, 10) matrix. Each cell is the
softmax of the scaled dot product of one word with another (for example, the word “my” with the word
“bed”), and thanks to the softmax each row sums to 1:

          [SOS]  Before  my    bed   lies  a     pool  of    moon  bright
[SOS]     0.62   0.19    0.02  0.02  0.04  0.01  0.00  0.09  0.00  0.02
Before    0.15   0.00    0.00  0.01  0.00  0.00  0.17  0.00  0.67  0.00
my        0.09   0.02    0.56  0.02  0.01  0.08  0.11  0.02  0.05  0.03
bed       0.10   0.06    0.03  0.00  0.53  0.12  0.01  0.11  0.00  0.04
lies      0.02   0.00    0.00  0.05  0.80  0.00  0.02  0.04  0.01  0.06
a         0.01   0.00    0.02  0.02  0.00  0.03  0.68  0.16  0.03  0.06
pool      0.00   0.16    0.02  0.00  0.03  0.56  0.00  0.00  0.22  0.01
of        0.22   0.00    0.01  0.05  0.19  0.44  0.00  0.00  0.04  0.04
moon      0.00   0.67    0.01  0.00  0.02  0.03  0.23  0.01  0.00  0.03
bright    0.06   0.00    0.03  0.03  0.43  0.21  0.03  0.06  0.13  0.03
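A minimal sketch (assumption, PyTorch) of this scaled dot-product self-attention, with Q = K = V equal to the encoder input as described above:

    import math
    import torch

    def self_attention(x: torch.Tensor) -> torch.Tensor:
        q, k, v = x, x, x                                     # Q = K = V = encoder input
        d_k = q.size(-1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)     # (seq_len, seq_len)
        weights = torch.softmax(scores, dim=-1)               # each row sums to 1
        return weights @ v                                    # (seq_len, d_model)

    x = torch.randn(10, 512)        # 10 tokens, d_model = 512
    out = self_attention(x)         # shape: (10, 512)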
The self-attention mechanism: the reason behind the causal mask

A language model is a probabilistic model that assigns probabilities to sequences of words.
In practice, a language model allows us to compute the following:

P[ “China” | “Shanghai is a city in” ]
   next token   prompt

Shanghai is a city in China, it is also a financial center.
(left context: “Shanghai is a city in”; right context: “, it is also a financial center.”)

To model the probability distribution above, each word should only depend on the words that come
before it (the left context).
We will see later that in BERT we make use of both the left and the right context.



Self-Attention mechanism: causal mask

The interactions of a token with the tokens that come after it are replaced with -∞ before applying the
softmax, so that the softmax turns them into zeros and each token can only attend to itself and to the
tokens on its left.

QK^T / sqrt(d_k) with the causal mask applied:

          [SOS]  Before  my    bed   lies  a     pool  of    moon  bright
[SOS]     5.45   -∞      -∞    -∞    -∞    -∞    -∞    -∞    -∞    -∞
Before    4.28   2.46    -∞    -∞    -∞    -∞    -∞    -∞    -∞    -∞
my        8.17   3.56    5.54  -∞    -∞    -∞    -∞    -∞    -∞    -∞
bed       6.71   4.13    6.76  0.79  -∞    -∞    -∞    -∞    -∞    -∞
lies      5.43   7.59    3.91  6.14  9.03  -∞    -∞    -∞    -∞    -∞
a         4.42   4.35    7.55  3.14  1.35  7.57  -∞    -∞    -∞    -∞
pool      8.36   6.00    4.56  0.52  3.13  6.78  9.00  -∞    -∞    -∞
of        2.21   3.72    4.16  6.30  0.66  6.14  7.46  6.77  -∞    -∞
moon      4.08   6.22    5.00  4.20  5.72  5.35  7.46  3.55  4.70  -∞
bright    6.43   8.88    6.17  3.65  4.54  5.22  5.51  5.55  0.64  1.38

softmax( QK^T / sqrt(d_k) ):

          [SOS]  Before  my    bed   lies  a     pool  of    moon  bright
[SOS]     1.00   0.00    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
Before    0.86   0.14    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
my        0.92   0.01    0.07  0.00  0.00  0.00  0.00  0.00  0.00  0.00
bed       0.47   0.04    0.49  0.00  0.00  0.00  0.00  0.00  0.00  0.00
lies      0.02   0.18    0.00  0.04  0.75  0.00  0.00  0.00  0.00  0.00
a         0.02   0.02    0.47  0.01  0.00  0.48  0.00  0.00  0.00  0.00
pool      0.31   0.03    0.01  0.00  0.00  0.06  0.59  0.00  0.00  0.00
of        0.00   0.01    0.02  0.15  0.00  0.12  0.47  0.23  0.00  0.00
moon      0.02   0.16    0.05  0.02  0.10  0.07  0.55  0.01  0.03  0.00
bright    0.07   0.71    0.05  0.03  0.01  0.02  0.03  0.03  0.02  0.03

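A minimal sketch (assumption, PyTorch) of how the causal mask can be applied: positions above the diagonal are set to -∞ before the softmax, so their attention weights become zero.

    import math
    import torch

    def causal_self_attention(x: torch.Tensor) -> torch.Tensor:
        seq_len, d_k = x.shape
        scores = x @ x.T / math.sqrt(d_k)                       # (seq_len, seq_len)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))        # hide future tokens
        weights = torch.softmax(scores, dim=-1)                 # -inf becomes 0 after softmax
        return weights @ x

    out = causal_self_attention(torch.randn(10, 512))           # (10, 512)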


Self-Attention mechanism: output sequence

Attention(Q, K, V) = softmax( QK^T / sqrt(d_k) ) V

The (10, 10) matrix softmax( QK^T / sqrt(d_k) ) computed above is multiplied by V (10, 512) to obtain the
Attention Output (10, 512).

Each row of the “Attention Output” matrix represents the embedding of a token of the output sequence: it
captures not only the meaning of the token and its position, but also the interaction of that token with all
the other tokens, limited to the interactions whose softmax score is not zero. All 512 dimensions of each
output vector depend only on the attention scores that are non-zero.


Introducing BERT

BERT’s architecture is made up of layers of encoders of the Transformer model, followed by an output
layer that depends on the specific task:

• BERT_BASE
  • 12 encoder layers
  • The hidden size of the feed-forward layer is 3072
  • 12 attention heads
• BERT_LARGE
  • 24 encoder layers
  • The hidden size of the feed-forward layer is 4096
  • 16 attention heads

Differences with the vanilla Transformer:
• The embedding vector size is 768 (BERT_BASE) and 1024 (BERT_LARGE)
• Positional embeddings are absolute, learnt during training, and limited to 512 positions
• The linear layer head changes according to the application

BERT uses the WordPiece tokenizer, which also allows sub-word tokens. The vocabulary size is
~30,000 tokens.
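Just to summarize the hyperparameters above in code form (a plain sketch, not an official configuration object):

    from dataclasses import dataclass

    @dataclass
    class BertConfig:
        num_layers: int            # encoder layers
        hidden_size: int           # embedding vector size
        ffn_hidden_size: int       # feed-forward hidden size
        num_heads: int             # attention heads
        vocab_size: int = 30000    # WordPiece vocabulary (~30,000 tokens)
        max_positions: int = 512   # learnt absolute positional embeddings

    BERT_BASE  = BertConfig(num_layers=12, hidden_size=768,  ffn_hidden_size=3072, num_heads=12)
    BERT_LARGE = BertConfig(num_layers=24, hidden_size=1024, ffn_hidden_size=4096, num_heads=16)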



BERT vs GPT/LLaMA
BERT stands for Bidirectional Encoder Representations from Transformers.
1. Unlike common language models, BERT does not handle “special tasks” with prompts; rather, it can be
   specialized on a particular task by means of fine-tuning.
2. Unlike common language models, BERT has been trained using both the left context and the right context.
3. Unlike common language models, BERT is not built specifically for text generation.
4. Unlike common language models, BERT has not been trained on the next-token prediction task, but
   rather on the Masked Language Model and Next Sentence Prediction tasks.

*common language models = GPT, LLaMA, etc.



Tasks in GPT/LLaMA vs BERT

• Question Answering in GPT/LLaMA: prompt engineering.
• Question Answering in BERT: take a pre-trained BERT and fine-tune it on QA.



The importance of left context in human conversations
Left context is used for example in phone conversations:

User: Hello! My internet line is not working, could you send a technician?
Operator: Hello! Let me check. Meanwhile, can you try restarting your WiFi router?
User: I have already restarted it but looks like the red light is not going away.
Operator: All right. I’ll send someone.



The importance of right context in human conversations
Imagine there’s a kid who just broke his mom’s favorite necklace. The kid doesn’t want to tell the truth to
his mom, so he decides to make up a lie.
So, instead of saying directly: “Your favorite necklace has broken”,
the kid may say: “Mom, I just saw the cat playing in your room and your favorite necklace has broken.”
Or he may say: “Mom, aliens came through your window with laser guns and your favorite necklace has
broken.”
As you can see, the lie is conditioned on what we want to say next. Whatever lie we make up, it will always
be conditioned on the conclusion we want to arrive at (the necklace being broken).



Masked Language Model (MLM)

Also known as the Cloze task: randomly selected words in a sentence are masked, and the model must
predict the right word given the left and the right context.

Rome is the capital of Italy, which is why it hosts many government buildings.
        ↓ randomly select one or more tokens and replace them with the special token [MASK]
Rome is the [MASK] of Italy, which is why it hosts many government buildings.
        ↓ the model must predict the masked word
capital



Left and right context in BERT

This is the reason it is a Bidirectional Encoder: since no causal mask is applied, each token “attends” to the
tokens to its left and to the tokens to its right in the sentence.

Attention(Q, K, V) = softmax( QK^T / sqrt(d_k) ) V

Here Q is (10, 768) and K^T is (768, 10), so softmax( QK^T / sqrt(d_k) ) is again a (10, 10) matrix in which
any cell can be non-zero (the same attention scores shown earlier for the unmasked example).
Masked Language Model (MLM): details
Rome is the capital of Italy, which is why it hosts many government buildings.

The pre-training procedure selects 15% of the tokens from the sentence to be masked.
When a token is selected to be masked (suppose the word “capital” is selected):
• 80% of the time it is replaced with the [MASK] token → Rome is the [MASK] of Italy, which is why it hosts many government buildings.

• 10% of the time it is replaced with a random token → Rome is the zebra of Italy, which is why it hosts many government buildings.

• 10% of the time it is not replaced → Rome is the capital of Italy, which is why it hosts many government buildings.
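A minimal sketch (assumption) of this 80/10/10 masking rule; the word-level tokens and the `vocab` list are simplifications for illustration.

    import random

    MASK_TOKEN = "[MASK]"

    def mask_tokens(tokens: list[str], vocab: list[str], select_prob: float = 0.15):
        masked, labels = [], []
        for tok in tokens:
            if random.random() < select_prob:           # 15% of the tokens are selected
                labels.append(tok)                       # the model must predict the original token
                r = random.random()
                if r < 0.8:
                    masked.append(MASK_TOKEN)            # 80%: replace with [MASK]
                elif r < 0.9:
                    masked.append(random.choice(vocab))  # 10%: replace with a random token
                else:
                    masked.append(tok)                   # 10%: keep the original token
            else:
                masked.append(tok)
                labels.append(None)                      # not selected: nothing to predict here
        return masked, labels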



Masked Language Model (MLM): training

Input (14 tokens):  Rome is the [MASK] of Italy, which is why it hosts many government buildings.
Output (14 tokens): TK1 TK2 TK3 TK4 TK5 TK6 TK7 TK8 TK9 TK10 TK11 TK12 TK13 TK14
Target (1 token):   capital (the original word at the masked position)

The loss compares the output at the masked position with the target → run backpropagation to update
the weights.


Next Sentence Prediction (NSP)

Many downstream applications (for example choosing the right answer given 4 choices) require learning
relationships between sentences rather than single tokens; that’s why BERT has also been pre-trained on
the Next Sentence Prediction task.

Given a text (here, Li Bai’s poem):
  Before my bed lies a pool of moon bright
  I could imagine that it's frost on the ground
  I look up and see the bright shining moon
  Bowing my head I am thinking of home

• 50% of the time, we select the actual next sentence as Sentence B.
• 50% of the time, we select a random sentence from the text as Sentence B.

Sentence A = Before my bed lies a pool of moon bright
Sentence B = I look up and see the bright shining moon
IsNext or NotNext? → NotNext (Sentence B is not the sentence that actually follows Sentence A).



Next Sentence Prediction (NSP): segmentation embedding
Given sentence A and sentence B, how can BERT understand which tokens belong to sentence A and
which to sentence B? We introduce the segmentation embeddings: an embedding added to every token
that indicates whether it belongs to sentence A or to sentence B.
We also introduce two special tokens: [CLS] (prepended to the input) and [SEP] (placed between the two
sentences).
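A minimal sketch (assumption) of how the pair input and its segment IDs could be built; the helper name and the word-level split are made up for illustration.

    def build_pair_input(sentence_a: list[str], sentence_b: list[str]):
        tokens = ["[CLS]"] + sentence_a + ["[SEP]"] + sentence_b
        # segment 0 for [CLS], sentence A and its [SEP]; segment 1 for every token of sentence B
        segment_ids = [0] * (len(sentence_a) + 2) + [1] * len(sentence_b)
        return tokens, segment_ids

    tokens, segment_ids = build_pair_input(
        "Before my bed lies a pool of moon bright".split(),
        "I look up and see the bright shining moon".split(),
    )
    # For each token the model adds: token embedding + positional embedding + segment embedding.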



Next Sentence Prediction (NSP): training

Input (20 tokens):  [CLS] Before my bed lies a pool of moon bright [SEP] I look up and see the bright shining moon
                          (Sentence A)                                  (Sentence B)
Output (20 tokens): TK1 TK2 TK3 … TK20
                    The output corresponding to [CLS] (the sentence embedding) is fed to a
                    Linear Layer (2 output features) + Softmax.
Target (1 token):   NotNext (in the poem, the sentence that actually follows Sentence A is “I could imagine
                    that it's frost on the ground”)

Loss → run backpropagation to update the weights.
[CLS] token in BERT

The [CLS] token always interacts with all the other tokens, as we do not use any mask. So, we can consider
the [CLS] token as a token that “captures” the information from all the other tokens.

Attention(Q, K, V) = softmax( QK^T / sqrt(d_k) ) V

With Q of shape (10, 768) and K^T of shape (768, 10), softmax( QK^T / sqrt(d_k) ) is a (10, 10) matrix whose
first row contains the attention scores of [CLS] with respect to every token of the sentence; the scores are
the same ones shown in the earlier unmasked example, with [CLS] in place of [SOS].
[CLS] token: output sequence

Attention(Q, K, V) = softmax( QK^T / sqrt(d_k) ) V

Multiplying the (10, 10) softmax matrix by V (10, 768) gives the Attention Output (10, 768).

Each row of the “Attention Output” matrix represents the embedding of a token of the output sequence: it
captures not only the meaning of the token and its position, but also the interaction of that token with all
the other tokens, limited to the interactions whose softmax score is not zero. In particular, since the [CLS]
row has non-zero scores for every token, the output embedding of [CLS] depends on all the tokens of the
sentence.


Fine-Tuning BERT

Pre-Trained BERT
  → fine-tune on text classification
  → fine-tune on question answering (QA)



Text Classification

Text classification is the task of assigning a label to a piece of text. For example, imagine we are running
an internet provider and we receive complaints from our customers. We may want to classify requests
coming from users as hardware problems, software problems or billing issues.

• “My router’s led is not working, I tried changing the power socket but still nothing.” → Hardware
• “My router’s web page doesn’t allow me to change password anymore… I tried restarting it but nothing.” → Software
• “In this month’s bill I have been charged 100$ instead of the usual 60$, why is that?” → Billing



Text Classification: training

Input: My router’s led is not working, I tried changing the power socket but still nothing.
Possible labels: Hardware, Software, Billing



Text Classification: training

Input (16 tokens):  [CLS] My router’s led is not working, I tried changing the power socket but still nothing.
Output (16 tokens): TK1 TK2 TK3 … TK16
                    The output corresponding to [CLS] is fed to a Linear Layer (3 output features) + Softmax.
Target (1 token):   Hardware

Loss → run backpropagation to update the weights.
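A minimal sketch (assumption, PyTorch) of a classification head on top of a pre-trained encoder; `encoder` is hypothetical and is assumed to return one hidden vector per token, with the [CLS] output feeding the linear layer.

    import torch
    import torch.nn as nn

    class BertForTextClassification(nn.Module):
        def __init__(self, encoder: nn.Module, hidden_size: int = 768, num_labels: int = 3):
            super().__init__()
            self.encoder = encoder                                 # pre-trained BERT encoder
            self.classifier = nn.Linear(hidden_size, num_labels)   # Hardware / Software / Billing

        def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
            hidden = self.encoder(input_ids)       # (batch, seq_len, hidden_size)
            cls_output = hidden[:, 0]              # output of the [CLS] token
            return self.classifier(cls_output)     # logits over the 3 labels

    # Training uses cross-entropy between these logits and the target label ("Hardware").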



Question Answering

Question answering is the task of answering questions given a context.
Context: “Shanghai is a city in China, it is also a financial center, its fashion capital and industrial city.”
Question: "What is the fashion capital of China?"
Answer: “Shanghai” (the model has to highlight the span of the context that contains the answer).

Two problems:
1. We need to find a way for BERT to understand which part of the input is the context and which one is the question.
2. We also need to find a way for BERT to tell us where the answer starts and where it ends in the context provided.



Question Answering: sentence A and B

The question is passed as Sentence A and the context as Sentence B, separated by [SEP] and preceded by
[CLS], exactly as in the Next Sentence Prediction setup.
Question Answering: start and end positions

Input (27 tokens):  [CLS] What is the fashion capital of China? [SEP] Shanghai is a city in China, it is also a
                    financial center, its fashion capital and industrial city.
Output (27 tokens): TK1 TK2 TK3 … TK27
                    Each output token is fed to a Linear Layer (2 output features) + Softmax, which indicates
                    whether that token is the start and/or the end of the answer.
Target:             start = TK10, end = TK10 (the single-token answer “Shanghai”)

Loss → run backpropagation to update the weights.
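A minimal sketch (assumption, PyTorch) of the question-answering head: a linear layer with 2 output features produces a start score and an end score for every token; `encoder` is again a hypothetical pre-trained BERT encoder.

    import torch
    import torch.nn as nn

    class BertForQuestionAnswering(nn.Module):
        def __init__(self, encoder: nn.Module, hidden_size: int = 768):
            super().__init__()
            self.encoder = encoder                     # pre-trained BERT encoder
            self.qa_head = nn.Linear(hidden_size, 2)   # 2 output features: start and end scores

        def forward(self, input_ids: torch.Tensor):
            hidden = self.encoder(input_ids)           # (batch, seq_len, hidden_size)
            start_logits, end_logits = self.qa_head(hidden).split(1, dim=-1)
            return start_logits.squeeze(-1), end_logits.squeeze(-1)   # (batch, seq_len) each

    # The predicted answer is the span between argmax(start_logits) and argmax(end_logits).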



Thanks for watching!
Don’t forget to subscribe for
more amazing content on AI
and Machine Learning!
