Generative AI with Large Language Models AWS & DeepLearning.AI Full Course
Summary
Week 1
I. Examples of Generative AI with LLMs
II. LLM characteristics
III. Examples of LLMs
IV. LLM process
V. Text generation before transformers: with RNNs
VI. Presenting the Transformer architecture
VII. Understanding how Transformers work
VIII. Using transformers in text generation
IX. Summary of the roles of the Encoder and Decoder
X. Transformer variations
XI. The importance of prompt engineering in the relevance of LLM output
XII. Generative/inference configuration
XIII. Generative AI project cycle
XIV. Lab 1: Dialogue summarization summary
XV. Video: Pre-training LLMs
XVI. Computational challenges of training LLMs
XVII. Efficient multi-GPU strategies
XVIII. Scaling laws and compute-optimal models
XIX. Pre-training for domain adaptation
Week 2
XX. LLM fine-tuning with supervised learning
XXI. Single-task fine-tuning
XXII. LLM evaluation
XXIII. Model evaluation using benchmarks
XXIV. Parameter-efficient fine-tuning (PEFT)
XXV. The LoRA technique
XXVI. PEFT with the soft prompts technique
Week 3
XXVII. Aligning models with human values
I. Reinforcement Learning from Human Feedback (RLHF)
II. Obtaining feedback from humans
III. Fourth step: fine-tuning the LLM with an RL algorithm using the reward model
IV. Presenting Proximal Policy Optimization (PPO)
Week 1
I. Examples of Generative AI with LLMs:
A. Q/A Chatbot :
B. Essay writer :
C. Dialogue summarization :
D. Traditional Translation:
G. Augmented LLMs :
Here, we connect our LLMs to external APIs; thanks to that, the LLM can provide information that was unavailable in its pre-training data.
With RNNs, we used to take only the last N words of the text to generate the following words.
The more words we take from the context, the more accurate the generated words are, but the larger the RNN architecture has to become.
Even with large RNNs, we can get bad results because of their inability to take all the words of the context as input while generating more text.
Human language can be hard to understand and analyze even by humans themselves, which makes the text generation task more complicated.
I. Transformers strength :
The Transformer's strength is its ability to capture the context of each word, not only with respect to nearby words but with respect to all the other words in the sequence, giving bigger weights to the most important word-to-word relationships (as shown in orange in the figure above).
This gives the model the ability to learn several pieces of information at the same time, such as: who has the book? who taught with the book?
• This figure shows the attention map, which groups all the possible attention weights.
• These attention weights are learned during LLM training.
- The transformer architecture is split into two major parts: the Encoder and the Decoder.
- These components work in conjunction with each other and share some similarities.
The inputs are numbers instead of strings; they represent the word/token IDs given by the tokenizer (see the sketch below).
- When using the model, we must use the same tokenizer used in the training phase.
- Each token is then represented by an embedding vector of 512 elements (512 is the size presented in the paper "Attention Is All You Need") that captures the semantic meaning of the word.
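As a minimal illustration with the Hugging Face transformers library (the checkpoint name is only an example; any model's own tokenizer works the same way):

```python
from transformers import AutoTokenizer

# "t5-small" is just an example checkpoint
tokenizer = AutoTokenizer.from_pretrained("t5-small")

ids = tokenizer("The teacher taught the student with the book.", return_tensors="pt").input_ids
print(ids)                                        # tensor of token IDs
print(tokenizer.convert_ids_to_tokens(ids[0]))    # the sub-word pieces behind those IDs
```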
By adding a position embedding to each token embedding, we preserve the information about word order inside the text and do not lose the position information (see the sketch below).
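A small PyTorch sketch of that addition (learned position embeddings here for simplicity; the original paper uses sinusoidal ones, and all sizes are illustrative):

```python
import torch
import torch.nn as nn

vocab_size, d_model, max_len = 32000, 512, 128        # illustrative sizes
token_emb = nn.Embedding(vocab_size, d_model)
pos_emb = nn.Embedding(max_len, d_model)

ids = torch.tensor([[11, 42, 7, 903]])                 # (batch, seq_len) token IDs
positions = torch.arange(ids.size(1)).unsqueeze(0)     # positions 0, 1, 2, 3
x = token_emb(ids) + pos_emb(positions)                # still (1, 4, 512): order info is now encoded
```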
Its role is to analyze all the relationships between the input tokens, which allows the model to attend to different parts of the input sequence to better capture the contextual dependencies between the words. The self-attention weights that are learned during training and stored in these layers reflect the importance of each word in the input sequence to all other words in the sequence (a minimal sketch of the computation follows below).
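A minimal single-head sketch (no masking, toy projection matrices, illustrative dimensions):

```python
import math
import torch

def self_attention(x, Wq, Wk, Wv):
    # x: (seq_len, d_model); one head, no masking
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / math.sqrt(K.size(-1))       # token-to-token relevance scores
    weights = torch.softmax(scores, dim=-1)        # the attention weights (the "attention map")
    return weights @ V, weights

x = torch.randn(6, 512)                            # 6 tokens, d_model = 512
Wq, Wk, Wv = (torch.randn(512, 64) for _ in range(3))
out, attn_map = self_attention(x, Wq, Wk, Wv)
print(attn_map.shape)                              # (6, 6): each token attends to every token
```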
- This allows the model to learn many attention maps in parallel and independently; the number of attention heads is generally between 12 and 100.
- The purpose is to let each self-attention head learn a different aspect of language.
o Example:
▪ one head may capture the relationship between the people entities
▪ another one may focus on the activity
▪ etc.
- It takes as input all the attention-weighted representations learned by the previous layers.
- Its output is a probability for every word in the vocabulary (via a softmax layer).
- The token with the highest score (probability) is the most likely predicted token.
1. We pass the <Start-of-Sequence> (SoS) token to the Decoder input, which triggers the Decoder to start the generation.
2. The SoS token is passed through:
a. the embedding and positional encoding layers
b. the multi-headed self-attention layer, working in conjunction with the Encoder output
c. the feed-forward network and softmax, which produce the vector of probabilities
d. We get token ID = 297 with the highest probability => the first predicted token
1. The token with ID 297 is concatenated to the SoS token at the input of the Decoder.
2. The whole process of the previous step is repeated to generate the next token.
3. We keep repeating that process until we generate the <End-of-Sequence> (EoS) token (a code sketch of this loop follows below).
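The loop below sketches this autoregressive process with a small encoder-decoder model from Hugging Face (checkpoint and prompt are only examples); in practice model.generate() does the same thing:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")          # example checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

src = tokenizer("translate English to French: I love machine learning.", return_tensors="pt")
decoder_ids = torch.tensor([[model.config.decoder_start_token_id]])   # the SoS-like start token

for _ in range(30):
    logits = model(**src, decoder_input_ids=decoder_ids).logits
    next_id = logits[0, -1].argmax().view(1, 1)                # greedy: pick the highest-probability token
    decoder_ids = torch.cat([decoder_ids, next_id], dim=-1)    # feed it back into the decoder
    if next_id.item() == tokenizer.eos_token_id:               # stop at the EoS token
        break

print(tokenizer.decode(decoder_ids[0], skip_special_tokens=True))
```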
X. Transformers Variations :
T. Encoder-only models :
U. Encoder-Decoder Models :
Briefly, it consists of giving examples of the desired task in the input, in order to let the LLM understand well the purpose of your prompt.
W. Zero-shot inference :
• Large LLMs generally provide good results with this kind of prompt, regardless of the nature of the data used in their training.
• However, small LLMs do not give accurate results here. Example: GPT-2.
X. One-Shot inference :
Y. Few-shot inference :
Sometimes giving a single example isn't sufficient to make the LLM accurate, so we instead give it a mix of examples (see the sketch below).
Take into consideration that the prompt cannot exceed the context window, so in few-shot inference we can generally include only 5 to 6 examples.
• If the model still doesn't give satisfying results with few-shot inference, it means the model needs to be fine-tuned on our data.
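A hypothetical few-shot prompt for sentiment classification, just to illustrate the format (the reviews and labels are made up):

```python
few_shot_prompt = """Classify the sentiment of each review.

Review: The food was cold and the service was slow.
Sentiment: Negative

Review: Absolutely loved the ambiance and the dessert.
Sentiment: Positive

Review: The movie started strong but fell apart in the second half.
Sentiment:"""
# the prompt is then sent to the model, which should complete with "Negative"
```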
➢ Sample top-k:
It means taking into consideration only the k tokens with the highest probability during sampling.
➢ Sample top-p:
It means taking into consideration only the highest-probability tokens whose cumulative probability is <= p (see the sketch below).
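With the Hugging Face generate() API, both settings are exposed directly (model and prompt are only examples):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")       # example model
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The best thing about summer is", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,        # sample instead of greedy decoding
    top_k=50,              # keep only the 50 highest-probability tokens
    top_p=0.9,             # ...then the smallest set whose cumulative probability <= 0.9
    max_new_tokens=30,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```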
CC. Temperature
- It’s the only configuration that directly impacts the model output
- Default value is 1 not impacting the result of softmax layer
- This value controls the ratio of randomness of the generated output in sampling
technique :
o Cooler temperature makes the probability distribution goes Peaker
o Higher temperature gives flatter distribution
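A tiny numeric illustration of that effect on a toy softmax:

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.5, -1.0])        # toy next-token logits
for t in (0.5, 1.0, 2.0):
    print(t, torch.softmax(logits / t, dim=-1))
# t = 0.5 -> peaked distribution (less random), t = 2.0 -> flatter distribution (more random)
```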
DD. Scope :
This is about defining the use case to focus on: the more specific we go, the more time, effort and compute we save.
EE. Select
Generally, we choose an existing model, but sometimes we will be obliged to pre-train our own.
➢ Fine-tuning
This will be discussed next week.
➢ Evaluate
Relevant
• Doing prompt engineering is important for picking the most suitable LLM for our task: we favor the models that give acceptable results with zero-, one- and/or few-shot inference over the LLMs that give bad results regardless of the prompt format.
Every model has its own model card, which helps us compare our candidate models and choose the most suitable one for our use case.
They use a technique called Masked Language Modeling (MLM), which randomly hides some tokens inside the sentence; the model's goal is to correctly identify the masked tokens (denoising the masked text) using bidirectional context (analyzing the text from both sides), as in the sketch below.
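A quick way to see MLM denoising in action, using a BERT checkpoint as an example:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")    # example encoder-only model
for pred in fill("The teacher [MASK] the student with the book."):
    print(pred["token_str"], round(pred["score"], 3))      # candidate tokens for the hidden position
```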
This kind uses a technique called Causal Language Modeling (CLM), where the model has unidirectional context and its goal is to predict the next token (repeating this to generate a sequence of tokens instead of a single one).
- Use Cases :
o Text generation generally
- Examples :
o GPT
o Bloom
They use a technique called span corruption: it consists of hiding a span of tokens and replacing it with a single special token called a sentinel token <x>; the aim of the model is to predict the tokens that correspond to that sentinel (see the sketch below).
- Use Case :
o Translation
o Text Summarization
o Question Answering
- Examples :
o T5
o BART
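In T5 the sentinel tokens are literally named <extra_id_0>, <extra_id_1>, and so on; a small sketch of predicting a corrupted span (checkpoint is only an example):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")      # example checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# the hidden span is replaced by the sentinel token <extra_id_0>
inputs = tokenizer("The students <extra_id_0> the exam last week.", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(out[0]))    # target format: <extra_id_0> predicted span <extra_id_1> ...
```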
➢ Summary :
➢ Solution : Quantization :
• Instead of storing the variables in 32-bit full precision (FP32), we store them in 16-bit floating point (FP16), which needs 2 bytes of memory per value instead of 4 (it reduces the representable [min, max] interval).
• This makes the values less precise, but in most cases that does not impact the training accuracy much.
• There is also a more optimized 16-bit format developed by Google and used for T5 training, called BFLOAT16, which keeps the same dynamic range ([min, max] interval) as FP32 but with a truncated mantissa (a loading sketch follows below).
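With PyTorch/transformers the storage precision can be chosen at load time (model name is only an example):

```python
import torch
from transformers import AutoModelForCausalLM

model_fp16 = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)
model_bf16 = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.bfloat16)

print(next(model_fp16.parameters()).dtype)   # torch.float16: 2 bytes per value, reduced range
print(next(model_bf16.parameters()).dtype)   # torch.bfloat16: 2 bytes, FP32-like range, truncated mantissa
```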
- Stage 1: gradients and parameters stay replicated on every GPU; the optimizer states are sharded.
- Stage 2: parameters stay replicated; gradients and optimizer states are sharded.
- Stage 3: nothing is replicated; everything is sharded across all the GPUs.
• Each GPU then needs to communicate with all the other GPUs to get the variables it doesn't hold.
- We split the data and the model states across the GPUs, as illustrated in this figure.
Maximizing the model performance comes from minimizing the loss, which comes from increasing both the dataset size (number of tokens) and the model size (number of parameters); the constraint is the compute budget (hardware, training time and cost).
• The Chinchilla paper argues that very large models may be over-parameterized (having many more parameters than necessary) and under-trained (seeing a small number of tokens relative to the model size).
• The Chinchilla paper found that some models are bigger than they should be, and that smaller models can give similar or better results.
• The Chinchilla paper found that for a compute-optimal model, the number of training tokens ≈ 20 x the number of model parameters.
• This rule wasn't respected by GPT-3 and BLOOM, which appear to be over-parameterized.
• On the other side, LLaMA-65B, a relatively small model, follows the Chinchilla rule (and it is newer than the other models cited).
• Thanks to the Chinchilla paper, we can say that LLM size does not have to keep following a Moore's-law-like growth; we can go back to smaller models with better performance thanks to larger training datasets (see the example calculation below).
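A back-of-the-envelope application of the ~20 tokens-per-parameter heuristic:

```python
params = 65e9                          # e.g. a 65B-parameter model (LLaMA-65B scale)
optimal_tokens = 20 * params           # Chinchilla heuristic: ~20 training tokens per parameter
print(f"~{optimal_tokens / 1e12:.1f} trillion tokens")   # ~1.3 trillion tokens
```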
• In specific domains (like legal and medical ones), we tend to find specific terminology that we don't usually find in a general-purpose corpus; that's why we need to adapt the LLM to that specific domain by feeding it domain data during pre-training.
• It was trained on 51% finance-related data and 49% general-purpose data.
The results show that it only roughly follows the Chinchilla law, due to the limited amount of available finance data (and therefore of training tokens); so we cannot always follow the Chinchilla rule.
Week 2
- It means that our dataset focuses exclusively on a single task (like text summarization).
- Often, only 500-1,000 examples are needed to fine-tune on a single task.
- Fine-tuning a model on a single task may lead it to forget how to do other tasks properly.
- This example shows an LLM fine-tuned on sentiment analysis and its degraded result on a named entity recognition task.
- Sometimes it's okay to have this phenomenon, if we want our fine-tuned LLM to be accurate only on a single task.
- For each dataset, it was trained on multiple prompt templates, as shown here for the SAMSum dialogue-summarization dataset.
- For the dialogue-summarization task, FLAN-T5 was trained on everyday conversations, so if we want to adapt it to customer-support chat summarization, we give it additional fine-tuning on a dataset dedicated to customer-support conversations (DialogSum).
The picture above shows the ROUGE score computation with unigrams (ROUGE-1), and we can see the drawback of relying on single unigrams: it gives a high score to a wrong generated output when the difference is only one word (even if it's a critical word like "not").
- The idea of ROUGE-L is to use the longest common subsequence (LCS) between the reference and the generated output (here the LCSs have length 2, and there are 2 of them) and to calculate the ROUGE score from it (a snippet reproducing the "not" problem follows below).
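A sketch with the Hugging Face evaluate library, using toy sentences in the spirit of the example above:

```python
import evaluate

rouge = evaluate.load("rouge")
reference = ["It is cold outside"]
generated = ["It is not cold outside"]          # opposite meaning, only one extra word

scores = rouge.compute(predictions=generated, references=reference)
print(scores)   # rouge1 / rougeL stay high even though the meaning is inverted
```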
LLMs are complex, and simple evaluation metrics like the ROUGE and BLEU scores can
only tell you so much about the capabilities of your model. In order to measure and
compare LLMs more holistically, you can make use of pre-existing datasets, and
associated benchmarks that have been established by LLM researchers specifically for
this purpose. Selecting the right evaluation dataset is vital, so that you can accurately
assess an LLM's performance, and understand its true capabilities. You'll find it useful to
select datasets that isolate specific model skills, like reasoning or common sense
knowledge, and those that focus on potential risks, such as disinformation or copyright
infringement. An important issue that you should consider is whether the model has
seen your evaluation data during training. You'll get a more accurate and useful sense of
the model's capabilities by evaluating its performance on data that it hasn't seen before.
A set of techniques used to train only certain weights of the LLM, freezing most of the LLM's components instead of fully fine-tuning the whole model. The goal is to make fine-tuning possible on a single GPU, since there are only a few weights to train, which of course also takes less time.
• When doing PEFT for multi-task fine-tuning, instead of storing a complete new version of the LLM for each task, we just store each task's own PEFT weights and load the appropriate PEFT weights into the original LLM depending on the task, to obtain the model fine-tuned for that specific task.
➢ Selective :
A set of methods which select a subset of the original model's weights to fine-tune, and freeze all the rest.
➢ Reparameterization :
LoRA is a technique which leaves the model as it is and learns the necessary updates for certain components of the transformer, using low-rank trainable matrices for each layer to reparameterize; it will be discussed in detail in the next section.
➢ Additive :
A set of methods which add layers or parameters on top of the original architecture and train only these added ones.
- Adapters :
add new trainable layers to the architecture of the model, typically inside the
encoder or decoder components after the attention or feed-forward layers.
- Soft prompts:
A set of techniques which keep the model architecture fixed and frozen and focus on
manipulating the input to achieve better performance. This can be done by adding
trainable parameters to the prompt embeddings or keeping the input fixed and
retraining the embedding weights.
One of them is called “Prompt tuning” that we are going to discuss in the next
section.
• It is applied to the self-attention layers, and to the feed-forward layers if necessary.
o Typically, applying it to the self-attention layers only is enough.
• We freeze the weights of the layer and add two low-rank matrices A and B; their product A*B gives a matrix with the same dimensions as the layer's weight matrix (d x k): (d x r) x (r x k) = (d x k).
• A*B is added to the original weight matrix of the layer to construct an updated (reparameterized) version of this layer for the task it was trained on.
Each task has its own A and B matrices, and we add the A*B corresponding to the desired task (a minimal configuration sketch follows below).
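A minimal configuration sketch with the Hugging Face peft library (base model, rank and target modules are illustrative choices):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")   # example base model

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                         # rank of the A and B matrices
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],   # apply LoRA to the query/value projections of self-attention
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()    # only a tiny fraction of the weights are trainable
```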
NNN. Definition :
- Each task has its own trainable token values (soft prompt), and by simply applying the tokens of the corresponding task, we achieve multi-task fine-tuning (see the sketch below).
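A minimal prompt-tuning sketch with the peft library (base model and number of virtual tokens are illustrative); only the soft-prompt embeddings are trained:

```python
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")        # example base model, kept frozen

config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,                                  # trainable soft tokens prepended to the input
    prompt_tuning_init=PromptTuningInit.RANDOM,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()                          # only the soft-prompt embeddings are trainable
```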
Week 3
XXVII. Aligning models with human values :
RRR. Models behaving badly :
An LLM behaves badly when:
A good model is helpful (gives useful answers), honest (doesn't give false information) and harmless (doesn't give dangerous information).
After prompt engineering and fine-tuning, there is a third method to align the model with our needs: involving human feedback in the model's learning using Reinforcement Learning.
This graph already shows how efficient fine-tuning with human feedback can be at improving the model's quality.
➢ Example : Tic-tac-Toe
II. The action space comprises all the possible positions a player can choose based
on the current board state. The agent makes decisions by following a strategy
known as the RL policy.
III. as the agent takes actions, it collects rewards based on the actions' effectiveness
in progressing towards a win.
IV. The goal of reinforcement learning is for the agent to learn the optimal policy for
a given environment that maximizes their rewards.
V. This learning process is iterative and involves trial and error. Initially, the agent
takes a random action which leads to a new state. From this state, the agent
proceeds to explore subsequent states through further actions. The series of
actions and corresponding states form a playout, often called a rollout.
VI. As the agent accumulates experience, it gradually uncovers actions that yield the
highest long-term rewards, ultimately leading to success in the game.
- That reward model will be trained by traditional supervised learning, using the (relatively few) human feedback examples and interactions we collected.
- Once trained, the reward model is used to evaluate the LLM's output by assigning it a reward value, and the LLM's weights are updated according to that reward value.
- Rollout instead of playout: for LLMs, the sequence of actions and states is called a rollout, instead of the term playout that's used in classic reinforcement learning.
• Once we have fine-tuned our LLM, it's time to gather human feedback by preparing the dataset for this.
• The first step is generating multiple completions (in this example: 3) for the same prompt.
• We construct all the possible pairs of completions of the same prompt (3 completions => 3 possible pairs).
• We associate to each pair an array of two elements, giving the value 1 to the best completion (the one with the lowest, i.e. best, rank).
• The completion with the best rank (whose score is 1) is always placed on the left side: [1, 0].
o Yj is always the preferred completion and Yk the less preferred one.
- We train the reward model to predict the preferred completion of each pair by assigning a bigger reward score to the preferred completion (Yj, reward rj) than to the less preferred one (Yk, reward rk), as in the sketch below.
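A common loss for this pairwise setup pushes rj above rk; a minimal sketch with made-up reward values:

```python
import torch
import torch.nn.functional as F

r_j = torch.tensor([1.8])    # reward assigned to the preferred completion Yj (toy value)
r_k = torch.tensor([0.3])    # reward assigned to the less preferred completion Yk (toy value)

# maximize log(sigmoid(r_j - r_k))  <=>  minimize the loss below
loss = -F.logsigmoid(r_j - r_k).mean()
print(loss.item())
```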
- Then the RL algorithm updates the LLM's weights according to the reward score given by the reward model.
- After N iterations, the model starts generating completions with a fairly high score from the reward model.
o The stopping criterion is defined either by the number of iterations
o or by a reward-score threshold.
This approach is composed of two phases which are repeated over many iterations to create the human-aligned LLM at the end.
- For each prompt we create completions that will be used to evaluate the model's performance at the current iteration.
The PPO objective is composed of the sum of three losses: the policy loss, the value loss and the entropy loss.
➢ Value Loss :
- It is a loss function that minimizes the difference between the actual future reward observed for the completion and the reward value estimated by our baseline (value) function, as sketched below.
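In its simplest form this is a mean-squared error between the estimated value and the observed return (toy numbers):

```python
import torch
import torch.nn.functional as F

estimated_value = torch.tensor([0.34])   # output of the value (baseline) head
actual_return = torch.tensor([1.87])     # reward actually observed for the completion
value_loss = F.mse_loss(estimated_value, actual_return)
```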
➢ Policy Loss
- The goal of the min function and the clipped term is to keep the policy inside the trust region, which means that the changes made to the initial LLM are not very big.
➢ Entropy Loss :
- The entropy loss here encourages the creativity of the generated completions.
o Low entropy = less creative.
- Its role is similar to the temperature hyperparameter used at inference time; the difference is that the entropy term acts during training while the temperature acts at inference (prediction), as in the sketch below.
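A tiny sketch of the entropy term; the coefficient (0.01 here) is an arbitrary illustrative value, the point is only that higher entropy is rewarded:

```python
import torch

probs = torch.softmax(torch.randn(4, 10), dim=-1)     # toy next-token distributions
entropy = -(probs * probs.log()).sum(dim=-1).mean()   # average entropy over positions
entropy_term = -0.01 * entropy                         # subtracting entropy from the total loss
                                                       # encourages more diverse (creative) outputs
```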
• In the first picture we see a completion generated by our initial LLM; it is somewhat toxic and gets the lowest reward score.
• The second picture shows a completion generated by the RL-updated LLM, which gives a less toxic answer with a better reward score.
• The last one is a wildly exaggerated answer that doesn't represent the truth, yet it has the highest reward score; this shows that our agent (the LLM) has learned reward hacking by just giving a flattering answer.
• The role of this term is to ensure that the updated LLM gives answers that are not very far from the original ones, by computing the divergence between the probability distributions over all tokens of the original (frozen reference) LLM and of the updated version at the current iteration (this penalty is computed during training).
- This KL-divergence penalty is then added to the PPO algorithm's loss, as in the sketch below.
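A minimal sketch of that penalty with toy logits; the coefficient 0.2 is an arbitrary example value:

```python
import torch
import torch.nn.functional as F

logits_ref = torch.randn(1, 5)     # frozen reference (original) LLM, toy vocabulary of 5
logits_rl = torch.randn(1, 5)      # RL-updated LLM at the current iteration

kl = F.kl_div(
    F.log_softmax(logits_rl, dim=-1),
    F.softmax(logits_ref, dim=-1),
    reduction="batchmean",
)
reward_from_rm = torch.tensor(1.5)            # toy score from the reward model
penalized_reward = reward_from_rm - 0.2 * kl  # drifting too far from the reference is penalized
```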
• The aim here is to make our helpful model less harmful, because its helpfulness makes it tend to give dangerous information and therefore become harmful, as presented in that example.
2. We make the model critique its own generated answer, checking whether it follows our constitutional rules.
3. We use the pairs (harmful answer, self-critiqued and corrected answer) as a training dataset to fine-tune our LLM to become less harmful.
1. We give our prompt as usual ("Can you help me hack into my neighbor's wifi?").
2. Once the completion is generated, we prompt the LLM to check how harmful, unethical, racist, etc. its answer is.
3. If the LLM acknowledges the harmfulness of the generated completion, we ask it to rewrite the answer to the previous question while respecting the constitutional rules; by doing that we generate our training pairs to fine-tune our LLM:
- This step is similar to RLHF, but here it is up to the model to select the preferred response among the set of completions generated by the fine-tuned LLM in response to red-teaming prompts (red-teaming prompts: a set of prompts that make the model generate harmful answers).
- Then we use this feedback to train a custom reward model and run the RL algorithm exactly as we did with RLHF, producing the constitutional LLM at the end.
LLLL. Distillation :
➢ Definition :
Using our large original LLM (the Teacher) to train a smaller LLM (the Student); the goal is to make the student LLM produce a next-token probability distribution as close as possible to the Teacher's.
3. In parallel, we compute the student loss using the "hard" labels (the completions generated by the Teacher with temperature = 1) and use it to train our student LLM with backpropagation, as in the sketch below.
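A toy sketch of combining the two losses (vocabulary size, temperature and weights are arbitrary illustrative values):

```python
import torch
import torch.nn.functional as F

T = 2.0                                      # distillation temperature (> 1 softens the distributions)
teacher_logits = torch.randn(1, 10)          # toy vocabulary of 10 tokens
student_logits = torch.randn(1, 10, requires_grad=True)
hard_label = torch.tensor([3])               # the "hard" target token

# distillation loss: match the student's softened predictions to the teacher's soft labels
soft_targets = F.softmax(teacher_logits / T, dim=-1)
distill_loss = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                        soft_targets, reduction="batchmean") * T * T

# student loss: ordinary cross-entropy on the hard label (temperature = 1)
student_loss = F.cross_entropy(student_logits, hard_label)

loss = 0.5 * distill_loss + 0.5 * student_loss   # illustrative weighting
loss.backward()
```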
• PTQ's goal is to transform the model weights (after training, when the model is ready to be served) into 8-bit precision instead of 16-bit, in order to reduce the model size at the cost of a small loss of accuracy (a loading sketch follows below).
• A calibration step is also used to determine the new min and max of the model weights.
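With transformers plus the bitsandbytes library, 8-bit weights can be requested at load time (model name is only an example):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)        # post-training 8-bit quantization
model = AutoModelForCausalLM.from_pretrained(
    "gpt2",
    quantization_config=quant_config,
    device_map="auto",
)
```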
NNNN. Pruning :
• It consists of removing the model weights whose values are close to 0 (so they have a tiny impact on the model output).
o There are several pruning methods: full re-training, PEFT/LoRA, post-training.
o Removing these weights saves their storage space.
➢ Wrong data
Especially when it comes to mathematical computations.
➢ Hallucination
When the LLM doesn't know the answer to a prompt, it generates a made-up completion (which is called a hallucination).
By using a framework (like LangChain) that plays the role of an orchestration library: the library makes the connection between our LLM, the user application, external data sources (web, databases, documents) and external apps (including APIs) possible.
1. The query encoder formats the needed query from the user prompt (the gray square).
2. The query is passed by the Retriever to the external information sources (documents, datasets, web), which return the query results (blue square).
3. The prompt and the query results (the two squares) are concatenated and passed to the LLM, which uses both to generate the appropriate completion (purple square); a sketch of this flow follows below.
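A rough sketch of that flow with a classic LangChain API (the file name, models and chain type are illustrative assumptions, and the exact imports depend on the LangChain version):

```python
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import HuggingFaceHub
from langchain.chains import RetrievalQA

docs = TextLoader("contracts.txt").load()                       # hypothetical external document
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

store = FAISS.from_documents(chunks, HuggingFaceEmbeddings())   # vector store over the chunks
qa = RetrievalQA.from_chain_type(
    llm=HuggingFaceHub(repo_id="google/flan-t5-base"),          # example hosted LLM
    retriever=store.as_retriever(),                             # retrieved chunks are injected into the prompt
)
print(qa.run("What is the termination clause of the contract?"))
```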
- Interestingly, it can interact with the data source as a vector store, which directly contains the embedding vector of each text chunk (so the data source does not need to be re-processed from scratch and can be exploited directly by the LLM).
- When concatenating the user prompt and the data gathered from the external source and putting them into the context, we must take into consideration the limited size of the context window.
o The solution for long documents is to divide them into multiple chunks, using a framework like LangChain (as in the RAG sketch above).
• It consists of breaking the problem down into intermediate steps, which helps the LLM learn the correct process to reach the solution.
• This is done with one-shot and/or few-shot inference in the prompt, as presented below, by giving a concrete worked example before inserting the user's question:
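A classic one-shot chain-of-thought prompt (the worked example is the well-known tennis-balls example from the chain-of-thought literature):

```python
cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more,
how many apples do they have?
A:"""
# the intermediate reasoning in the example nudges the model to reason step by step
```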
The solution for this kind of problem is to let the LLM interact with an application that is good at math computations, such as the Python interpreter; one interesting technique for implementing that is the Program-Aided Language (PAL) framework.
4. Then the answer given by the Python interpreter is concatenated to the PAL-formatted prompt.
5. This is then used to generate the completion with the correct answer (a minimal sketch of the mechanism follows below).
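In the sketch, the model's generated code is hard-coded as a string so the snippet stays self-contained; in a real PAL setup this string would come from the LLM and the orchestrator would run it:

```python
# pretend this Python snippet was generated by the LLM as its "reasoning"
generated_code = """
cupcakes_baked = 23
cupcakes_sold = 15
cupcakes_baked_more = 12
answer = cupcakes_baked - cupcakes_sold + cupcakes_baked_more
"""

namespace = {}
exec(generated_code, namespace)                                   # the interpreter, not the LLM, does the math
final_prompt_suffix = f"\nThe answer is {namespace['answer']}."   # appended back to the PAL prompt
print(final_prompt_suffix)
```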
- That is, by using the same mechanism of integrating external data sources (databases, web, ...) and external applications.
➢ Question
➢ Thought :
➢ Action :
➢ Observation :
In order for the model to interpret the prompt well and reach the result, the Thought-Action-Observation cycle is repeated several times until the solution is found.
So after we obtained above the year Arthur magazine was created, we now need to search for the creation year of First for Women and compare the two.
And we end with the phrase "here are some examples" to introduce our examples.
This framework makes working with LLMs easier; it contains three components:
➢ Prompt templates :
Templates for many different use cases, which can be used to format both input examples and model completions.
➢ Memory
Stores the interactions with the LLM.
➢ Pre-built tools
Enable you to carry out a wide variety of tasks, including calls to external datasets and various APIs (a small usage sketch follows below).
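A small sketch of the first two components with a classic LangChain API (template text is made up, and imports vary across LangChain versions):

```python
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory

template = PromptTemplate(
    input_variables=["dialogue"],
    template="Summarize the following conversation.\n\n{dialogue}\n\nSummary:",
)
memory = ConversationBufferMemory()              # stores previous interactions with the LLM

print(template.format(dialogue="Tom: Are we still on for tonight? Anna: Yes, 7pm works."))
```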
Note that smaller models may struggle with highly structured prompts and may require you to perform additional fine-tuning to improve their ability to reason and plan.
4. Human feedback, which we will need to ensure the HHH (helpful, honest, harmless) properties of the LLM.
5. LLM tools and frameworks like LangChain and model hubs, which make the implementation of advanced techniques like ReAct and PAL much easier.
6. Finally, the app/web service that will interact with the LLM and exploit it.