Transformers, How Do They Work?: Generative AI To Create Content
How do they work?
Transformers were created by Google in 2017 and have a system of attention, which assigns different weights to the significance of each part of the input data.
Generative AI is just a prediction machine
Inference, or predicting the next word based on the previous words, is at the heart of generative AI. Compared to AI 20+ years ago, machines are faster and more powerful, so they can train models that are orders of magnitude bigger than earlier models.
“To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer”
- Shakespeare
We encode it using n-grams
An n-gram is just a grouping of words with n elements; n can be anything, so let's say 3. The individual words are commonly referred to as tokens.
Let's break the quote apart into 3-grams
N-gram conversion
“To be, or not to be, that is the question: Whether 'tis nobler in the mind to
suffer”
Becomes
(to, be, or), (be, or, not), (or, not, to), (not, to, be), (to, be, that), …
This is after stripping the punctuation and converting everything to lowercase.
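As a minimal sketch of this conversion step (the function and variable names here are my own, not from the slides):

import re

def to_ngrams(text, n=3):
    # Strip punctuation and lowercase everything, as described above.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slide a window of size n across the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

quote = ("To be, or not to be, that is the question: "
         "Whether 'tis nobler in the mind to suffer")
print(to_ngrams(quote)[:5])
# The first five 3-grams: ('to', 'be', 'or'), ('be', 'or', 'not'),
# ('or', 'not', 'to'), ('not', 'to', 'be'), ('to', 'be', 'that')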
Let's build the inference
Now that we have generated the 3-grams, we can create an inference matrix stating what the next word can be for each 3-gram:
(to, be, or) → not
(be, or, not) → to
So on and so forth…
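Here is a sketch of how such a matrix could be built, assuming the same tokenization as above (build_matrix is my own name for it):

from collections import defaultdict

def build_matrix(tokens, n=3):
    matrix = defaultdict(list)
    for i in range(len(tokens) - n):
        key = tuple(tokens[i:i + n])       # the current 3-gram
        matrix[key].append(tokens[i + n])  # the word that follows it
    return matrix

tokens = "to be or not to be that is the question".split()
matrix = build_matrix(tokens)
print(matrix[("be", "or", "not")])  # ['to']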
What happens next depends on what
happened in the past
We look at the existing input and infer what the next word will be based on it. This is why we must give a prompt: it acts as a seed and starts the inference cycle.
To be or not
Based on the input's last 3-gram, (be, or, not), we need to infer what the next word will be. Looking at the previous inference matrix, we can infer that if we have a 3-gram of (be, or, not), then the next logical word will be to.
To be or not to
Now we feed the new last 3-gram into the LLM again and get the next predicted word.
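The whole inference cycle can be sketched as a loop. This reuses the matrix built in the earlier sketch, and generate is again my own name for it:

def generate(matrix, prompt, max_words=10):
    words = list(prompt)
    for _ in range(max_words):
        key = tuple(words[-3:])   # the last 3-gram of the output so far
        options = matrix.get(key)
        if not options:           # no known continuation: stop
            break
        words.append(options[0])  # take the predicted next word
    return " ".join(words)

print(generate(matrix, ["to", "be", "or", "not"]))
# to be or not to be that is the question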
The previous example is very simple, but let's try a trickier one: "Robert has a white dog. Kathy has a white cat." As 3-grams, this becomes:
(Robert, has, a), (has, a, white), (a, white, dog), (white, dog, Kathy), (dog, Kathy, has), (Kathy, has, a), (has, a, white), (a, white, cat)
Let's build the inference
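Written out by hand (lowercased, as before), the inference matrix for these 3-grams would look something like this:

matrix = {
    ("robert", "has", "a"):    ["white"],
    ("has", "a", "white"):     ["dog", "cat"],  # two possible next words!
    ("a", "white", "dog"):     ["kathy"],
    ("white", "dog", "kathy"): ["has"],
    ("dog", "kathy", "has"):   ["a"],
    ("kathy", "has", "a"):     ["white"],
}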
But there's a fatal problem! Let's prompt the LLM with:
Robert has a
From the previous 3-grams, we know that this 3-gram has only one option afterwards, which is white.
Feeding the last 3-gram, (has, a, white), back in, we now have two options to pick from: cat or dog.
What do we do?
We flip a coin
Since cat and dog are both equally likely to be picked, we flip a coin and randomly pick the next word. This is why LLMs sometimes hallucinate.
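In code, the coin flip is just a random choice (a sketch; the option list comes from the matrix above):

import random

options = ["dog", "cat"]       # the two equally likely next words for (has, a, white)
print(random.choice(options))  # sometimes 'dog', sometimes 'cat'

Swapping options[0] for random.choice(options) in the earlier generate sketch gives this same sampling behavior.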
LLMs generally pick the next word based on the existing input. When phrasing is common, they tend to hallucinate, since there are more options for the next word.