LLMs, GPT, and Prompt Engineering for Developers
Sinan Ozdemir
Data Scientist, Entrepreneur, Author, Lecturer
LLMs, GPT, and Prompt Engineering for Developers
Session 1: Introduction
Sinan Ozdemir
Data Scientist, Entrepreneur, Author, Lecturer
Brief History of Modern NLP
2001: Neural Language Models
2013: Encoding semantic meaning with Word2vec
2014–2017: Seq2seq + Attention
2017–Present: Transformers + Large Language Models
Language Models
In a language modeling task, a model is trained
to predict a missing word in a sequence of words.
• Auto-regressive
• Auto-encoding
Auto-__ Language Models
Auto-regressive Models: “If you don’t ___” (forward prediction)
Auto-encoding Models: “If you don’t ___ at the sign, you will get a ticket.”
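To make the contrast concrete, here is a minimal sketch (not from the slides) using Hugging Face pipelines; the model names gpt2 and bert-base-uncased are illustrative choices:

```python
from transformers import pipeline

# Auto-regressive (e.g., GPT-2): predicts the NEXT token, left to right
generator = pipeline("text-generation", model="gpt2")
print(generator("If you don't", max_new_tokens=5)[0]["generated_text"])

# Auto-encoding (e.g., BERT): fills a MASKED token using context from both sides
filler = pipeline("fill-mask", model="bert-base-uncased")
for pred in filler("If you don't [MASK] at the sign, you will get a ticket."):
    print(pred["token_str"], round(pred["score"], 3))
```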
Auto-__ Language Model Use Cases
Auto-regressive Models vs. Auto-encoding Models
[Screenshot: LLM response. Source: ChatGPT Playground]
Tradeoffs Between Different LLMs
• Auto-encoding models like BERT are fast at encoding semantic meaning for understanding tasks but cannot generate free text.
• An auto-encoding language model relies on attention and uses only the encoder from the transformer architecture.
Source: https://nlp.stanford.edu/pubs/clark2019what.pdf
T5
Text-to-Text Transfer Transformer
• A sequence to sequence model
• Relying on transfer learning (and a fifth “t”!)
• A pure transformer using both the encoder and decoder
Source: https://arxiv.org/pdf/1910.10683.pdf
GPT
Generative Pre-trained Transformers
“We scraped all outbound links from Reddit ... which received at least 3 karma ... [resulting in] 45 million links”
Source: “Language Models are Unsupervised Multitask Learners” (the GPT-2 paper, Radford et al., 2019)
Pre-training LLaMA 2 – Corpus
They claim it was trained on “2 trillion tokens” of data, but the paper never specifies exactly what data it was trained on, just that it was from the “web, mostly in English”.
This speaks to the biases found in LLMs and may also speak to the legal controversies surrounding the data used to train LLMs.
Source: https://ai.meta.com/resources/models-and-libraries/llama/
Intro to LLMs
LLMs, GPT, and Prompt Engineering for Developers
Session 2: Working with Pre-trained LLMs
Sinan Ozdemir
Data Scientist, Entrepreneur, Author, Lecturer
Applying LLMs
We can use LLMs in (generally) three ways:
Source: https://tech.ebayinc.com/engineering/how-ebay-created-a-language-model-with-three-billion-item-titles
Semantic Search
Source: https://www.sbert.net/examples/applications/semantic-search/README.html
Semantic Search
[Diagram: document corpora (context); symmetric search; fine-tune model on task/domain specific ... supervised task]
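A minimal semantic-search sketch using the sentence-transformers library cited above; the model name and the tiny corpus are illustrative, not from the course:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "A bold red wine with notes of cherry and oak.",
    "A crisp white wine with citrus and mineral tones.",
    "A sparkling wine with fine bubbles and green apple.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("fruity red", convert_to_tensor=True)

# Returns the top_k most similar corpus entries by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))
```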
Transfer Learning with BERT
[Diagram: additional task layers on top of a pre-trained BERT; selecting a source model; training data for the second task]
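A hedged sketch of the transfer-learning setup in the diagram, using Hugging Face's Trainer: a pre-trained BERT encoder plus a freshly initialized classification head, fine-tuned on a second task. The two-example dataset and labels are illustrative only:

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=2 adds a new (randomly initialized) classification layer on top
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Tiny illustrative "training data for the second task"
data = Dataset.from_dict(
    {"text": ["great wine", "terrible wine"], "label": [1, 0]}
)
data = data.map(lambda x: tokenizer(x["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=data,
    tokenizer=tokenizer,  # enables dynamic padding when batching
)
trainer.train()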
Fine-tuning LLMs
Why Fine-Tune?
1. Improves task-specific performance by enabling the model to tailor its knowledge to the task at hand, increasing accuracy.
2. Custom data ensures that the model is trained on information that is relevant and specific to your use case, making its output more applicable and accurate.
3. Fine-tuning with custom data enables the model to better understand and respond to industry-specific jargon, regional language nuances, or other unique data aspects.
[Diagram: pre-trained BERT (Encoder 1 ... Encoder 12) feeding a feedforward + softmax layer that outputs class probabilities, e.g. “negative”: 0.1]
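The diagrammed setup in action, sketched with a public fine-tuned checkpoint (an illustrative choice, not the course's model): a BERT-style encoder with a feedforward + softmax head scoring sentiment.

```python
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("This wine is disappointing."))
# e.g. [{'label': 'NEGATIVE', 'score': 0.99...}]
```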
RLHF - A Primer
LLM Alignment - Reinforcement Learning from Human Feedback
- Fine-tuning to subtly adjust an LLM’s output.
Data Collator – A tool for processing and preparing input data for a model. It transforms raw input data into a format that the model can understand, which may involve tokenization, padding, and batching.
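A minimal sketch of a data collator as just described, using Hugging Face's DataCollatorWithPadding (an assumption; the slide doesn't name a specific collator):

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorWithPadding(tokenizer=tokenizer)

# Two tokenized examples of different lengths...
features = [tokenizer("short text"), tokenizer("a somewhat longer example text")]

# ...are padded to the same length and stacked into one batch of tensors
batch = collator(features)
print(batch["input_ids"].shape)  # e.g. torch.Size([2, 8])
```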
Sinan’s Attempt at Wise Yet Engaging Responses
Assume they charge $0.0004 per 1000 tokens for the embedding engine we used
(Ada-002).
If we assume an average of 500 tokens per document (roughly a page of text), the
cost per document would be $0.0002.
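A quick check of the arithmetic above (the price and token counts are the slide's assumptions, not current OpenAI pricing):

```python
price_per_1k_tokens = 0.0004   # assumed Ada-002 embedding price per 1,000 tokens
tokens_per_document = 500      # roughly a page of text

cost_per_document = tokens_per_document / 1000 * price_per_1k_tokens
print(f"${cost_per_document:.4f} per document")  # $0.0002

# Scale up to estimate corpus-level costs
for n_docs in (10_000, 1_000_000):
    print(f"{n_docs:>9,} docs: ${n_docs * cost_per_document:,.2f}")
```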
Instead of thinking about a cost per token, we would want to estimate things like:
[Screenshot: LLM response. Source: ChatGPT Playground]
LLMs, GPT, and Prompt Engineering for Developers
Session 3: Prompt Engineering
Sinan Ozdemir
Data Scientist, Entrepreneur, Author, Lecturer
Prompt Engineering LLMs
JUST ASK
Source: OpenAI
Few-Shot Learning with GPT-3
Given a description of a book, output:
a. “yes” if the description is subjective, or
b. “no” if the description is objective
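A hedged sketch of few-shot prompting for this subjectivity task; the example descriptions and labels are illustrative, and call_llm is a hypothetical stand-in for whatever completion API you use:

```python
FEW_SHOT_EXAMPLES = [
    ("A thrilling, unforgettable masterpiece you will love.", "yes"),
    ("A 320-page hardcover novel published in 1998.", "no"),
]

def build_prompt(description: str) -> str:
    lines = ['Given a description of a book, output "yes" if the description '
             'is subjective or "no" if it is objective.', ""]
    for text, label in FEW_SHOT_EXAMPLES:
        lines += [f"Description: {text}", f"Answer: {label}", ""]
    lines += [f"Description: {description}", "Answer:"]
    return "\n".join(lines)

print(build_prompt("An epic tale that will change your life."))
# response = call_llm(build_prompt(...))  # hypothetical API call
```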
huggingface.co/datasets/gsm8k
GSM8K accuracy by prompt strategy:
Closest K=5 (CoT): 0.788, 0.601
Closest K=7 (CoT): 0.774, 0.574
Random K=3 (CoT): 0.744, 0.585
Closest K=1 (CoT): 0.709, 0.519
Just Ask (with CoT): 0.628, 0.382
Closest K=3 (no CoT): 0.27, 0.18
Just Ask (no CoT): 0.2, 0.09
Source: Quick Start Guide to LLMs by Sinan Ozdemir
Injecting Personas into Prompts
Next token predictions happen one token at a time
Source: https://jalammar.github.io/illustrated-gpt2/
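A minimal sketch of what "one token at a time" means in code: a greedy decoding loop with GPT-2 (an illustrative model choice, not the course's):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("You are a wise sommelier.", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(10):
        logits = model(input_ids).logits       # scores for every vocab token
        next_id = logits[0, -1].argmax()       # greedily pick the top token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```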
Parameters for generating text
temperature (float) - Lower values (below 1) make the model more confident and less random; higher values make generated text more random.
Source: https://huggingface.co/blog/how-to-generate
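A hedged sketch of sampling parameters with Hugging Face's generate (in the spirit of the blog post cited above); the parameter values are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
input_ids = tokenizer("The best wine pairing for", return_tensors="pt").input_ids

for temperature in (0.2, 1.5):
    out = model.generate(
        input_ids,
        do_sample=True,           # sample instead of greedy decoding
        temperature=temperature,  # <1: more confident; >1: more random
        top_p=0.95,               # nucleus sampling cutoff
        max_new_tokens=15,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(temperature, "->", tokenizer.decode(out[0]))
```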
Temperature - Continued
B. Mostly. It knows the information but lacks critical information (the information is too new to be in the model, or it knows the topic but not to the specifics I need).
   a. Create a secondary system to retrieve information on demand (e.g., semantic search)
   b. Few-shot and chain-of-thought prompting to help teach nuances/specifics
C. No, not at all; I need to teach it pretty much everything from scratch.
   a. Just ask with comprehensive instructions + frameworks
   b. Few-shot / chain-of-thought prompting
   c. Fine-tuning for long-term cost savings/speed
Does the LLM Know Enough for My Task?
A. Yes, it has all knowledge encoded and is ready to solve my task.
   a. Summarizing news articles
   b. Recommending news articles from a list of articles
B. Mostly. It knows the information but lacks critical information (the information is too new to be in the model, or it knows the topic but not to the specifics I need).
   a. Recommending news articles that came out this morning
C. No, not at all; I need to teach it pretty much everything from scratch.
   a. Recommending proprietary frameworks for thinking about marketing strategies
Sinan’s LLM Prototyping Framework
● Identify and document the specific inputs and outputs for your LLM application.
● Example: Given a user's taste and a list of book descriptions, the model should output a ranked list of book recommendations with reasons.
● Remember, requirements might change during testing or in different contexts.
● Assess if the model has the necessary information to perform the task.
● Class A: The model has all the required information encoded.
● Class B: The model mostly has the necessary information but lacks specific details or updated data.
● Class C: The model lacks the majority of required knowledge and needs extensive training.
● Create various versions of a prompt and experiment with them in the model's playground. This helps to refine the prompts and assess the model's knowledge requirement.
● Adjust parameters like temperature and top-p to refine the model's responses.
Sinan’s LLM Prototyping Framework
8. Evaluate and Plan for Scale/Production/Cost/Testing
● Assess the performance of the model, including its computational demands, and plan for potential scaling and production deployment.
● Also, consider the cost of deployment, which includes financial costs (like cloud resources and potential fine-tuning) and resource costs (like time and personnel for testing and maintenance).
● Create a basic version of the model using tools like Streamlit for quick testing and user feedback.
● Iterate on the model by refining the prompts, adjusting parameters, and fine-tuning the model based on feedback.
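A minimal Streamlit prototype sketch of the "basic version for quick testing and user feedback" idea above; call_llm is a hypothetical stand-in for your actual model call:

```python
import streamlit as st

def call_llm(prompt: str) -> str:
    # hypothetical: swap in your actual LLM API call here
    return f"(model response to: {prompt!r})"

st.title("LLM Prototype")
user_input = st.text_area("Describe the wines your client liked:")

if st.button("Recommend"):
    st.write(call_llm(user_input))
    st.radio("Was this helpful?", ["Yes", "No"])  # quick user feedback
```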
Sinan’s LLM Prototyping Framework
10. Labeling Data and Fine-tuning
● Plan for potential data labeling and fine-tuning. This includes considering the cost and time required for these steps.
● Remember, fine-tuning not only requires labeled data but also extensive computational resources, which can add to the overall cost.
11. Evaluation
● Consistently evaluate the model's performance using relevant metrics like semantic similarity, precision, recall, etc. These evaluations will guide the iterations and improvements.
The above framework is not exhaustive but provides a good starting point for designing applications with LLMs like ChatGPT. Each application will have unique needs and constraints, so this framework should be adapted accordingly.
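A hedged sketch of the metrics step 11 mentions: semantic similarity via sentence-transformers plus precision/recall via scikit-learn. The model name and the toy data are illustrative:

```python
from sentence_transformers import SentenceTransformer, util
from sklearn.metrics import precision_score, recall_score

model = SentenceTransformer("all-MiniLM-L6-v2")

# Semantic similarity between a reference answer and the model's output
reference = "A dry Riesling pairs well with spicy food."
generated = "Spicy dishes go nicely with a dry Riesling."
emb = model.encode([reference, generated], convert_to_tensor=True)
print("semantic similarity:", util.cos_sim(emb[0], emb[1]).item())

# Precision/recall for a classification-style task (toy labels)
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))
```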
Week 1 Assignment
1. Come prepared with at least 2 examples of a task to solve with an LLM
   a. Should fit within the idea of a larger product
2. For 1 example, complete the first 5 steps of designing an LLM application/feature
The examples I will walk through (inspired by a recent trip to my favorite wine bar):
Product: A platform for sommeliers to keep track of their customers/clients to help give recommendations
LLM Task 1: Given a list of wines my client liked with descriptions for the wines, plus a list of wines I have with descriptions, output an ordered subset of recommendations with reasoning (my hunch is B)
LLM Task 2: Given a lengthy wine description, output a summarization of the wine (hunch: A)
• LLMs are not perfect and will eventually produce untrue and harmful statements if left unchecked.
Includes: