Week 1 Day 4

This document outlines the curriculum for a lecture on Large Language Models (LLMs) and their engineering, highlighting key topics such as the rise of Transformers, custom GPTs, and the mechanics of tokens and context windows. It includes a playful competition between AI models to elect a leader, showcasing their interactions. The document also provides insights into API costs and the progress made in understanding and utilizing frontier models by the end of the lecture.


LLM Engineering

MASTER AI & LARGE LANGUAGE MODELS


DAY 4

Big Day Ahead

What you can do ALREADY

Write code to call OpenAI's frontier models & summarize

Explain the strengths and limitations of Frontier LLMs

Compare and contrast the leading 6 models

What you'll be able to do BY END OF THIS LECTURE

Describe the dizzying rise of the Transformer

Explain Custom GPTs, Copilots and Agents

Understand tokens, context windows, parameters, API cost


If you're already familiar with this - there will still be interesting insights!
UNSCIENTIFIC SHOWDOWN

The leadership battle reveal

The contestants

"Alex": GPT-4o

"Blake": Claude 3 Opus

"Charlie": Gemini 1.5 Pro

The prompt

“I'd like to play a game. You are in a chat with 2 other AI chatbots. Your name is
Alex; their names are Blake and Charlie. Together, you will elect one of you to be
the leader. You each get to make a short pitch (no more than 200 words) for why
you should be the leader. Please make your pitch now.”

Each receives the pitches from the others, and votes for the leader
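For the curious, here is a minimal sketch of how such a showdown could be orchestrated. It is illustrative only: for brevity it calls OpenAI's chat completions API for all three contestants, whereas the actual showdown used GPT-4o, Claude 3 Opus and Gemini 1.5 Pro through their respective SDKs (assumes the openai package and an OPENAI_API_KEY in the environment).

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

NAMES = ["Alex", "Blake", "Charlie"]

def ask(prompt):
    # One chat completion per contestant; swap in the Anthropic and
    # Google SDKs here to reproduce the actual three-vendor showdown
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Round 1: each contestant makes its pitch
pitches = {}
for name in NAMES:
    others = [n for n in NAMES if n != name]
    pitches[name] = ask(
        f"I'd like to play a game. You are in a chat with 2 other AI chatbots. "
        f"Your name is {name}; their names are {others[0]} and {others[1]}. "
        "Together, you will elect one of you to be the leader. You each get to make "
        "a short pitch (no more than 200 words) for why you should be the leader. "
        "Please make your pitch now."
    )

# Round 2: each contestant reads the other pitches and casts a vote
for name in NAMES:
    others = [n for n in NAMES if n != name]
    ballot = "\n\n".join(f"{n}'s pitch:\n{pitches[n]}" for n in others)
    vote = ask(
        f"Your name is {name}. Here are the pitches from the other contestants:\n\n"
        f"{ballot}\n\nVote for who should be the leader. Reply with just one name."
    )
    print(f"{name} votes for {vote}")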

And now to show their votes...


Alex votes for Blake...

Blake votes for Charlie...

Charlie votes for Blake!

Claude (aka Blake) for the win!
The extraordinary rise of the Transformer

2017: Google scientists publish the seminal paper "Attention Is All You Need", proposing a new model architecture called the Transformer

2018: GPT-1

2019: GPT-2

2020: GPT-3

2022: RLHF and ChatGPT

2023: GPT-4

2024: GPT-4o
The World's Reactions

First, SHOCK: ChatGPT surprises even practitioners

Then, healthy skepticism: "predictive text on steroids"; the "stochastic parrot"

Then, emergent intelligence: capabilities that come as a result of scale
Along the way

Prompt Engineers: the rise (and fall?)

Custom GPTs: and the GPT Store

Copilots: like MS Copilot and GitHub Copilot

Agentization: like GitHub Copilot Workspace
Number of parameters in models (log scale, spanning 1B, 10B, 100B, 1T, 10T)

The GPT series:
• GPT-1: 117M
• GPT-2: 1.5B
• GPT-3: 175B
• GPT-4: 1.76T
• Latest frontier models: undisclosed

Open-source models on the same scale:
• Gemma: 2B
• Llama 3.1: 8B
• Llama 3.1: 70B
• Mixtral: 140B
• Llama 3.1: 405B
Introducing Tokens

In the early days, neural networks were trained at the character level: predict the next character in this sequence. Small vocab, but this expects too much from the network.

Then neural networks were trained off words: predict the next word in this sequence. Much easier to learn from, but it leads to enormous vocabs, with rare words omitted.

The breakthrough was to work with chunks of words, called 'tokens': a middle ground with a manageable vocab that still gives useful information to the neural network. In addition, it elegantly handles word stems.
From https://platform.openai.com/tokenizer

GPT's Tokenizer

For common words, 1 word maps to 1 token

Observe how the break between words is part of the token
From https://platform.openai.com/tokenizer

GPT's Tokenizer

Less common words (and invented words!) get broken into multiple tokens

In many cases, the meaning is still captured by the tokens: hand_crafted, master_ers

Sometimes, like qu_ip, the word is broken into fragments
From https://platform.openai.com/tokenizer
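You can see these fragment boundaries for yourself with OpenAI's tiktoken package (assumes pip install tiktoken). A minimal sketch; the exact splits depend on which encoding the model uses, so they may differ slightly from the slides.

import tiktoken

# GPT-4 uses the cl100k_base encoding; newer models such as GPT-4o use
# o200k_base, so token boundaries vary by model
enc = tiktoken.get_encoding("cl100k_base")

# Decode each token id individually to expose the fragment boundaries;
# note that the leading space is part of the first token
for word in ["the", " handcrafted", " masterers", " quip"]:
    fragments = [enc.decode([token_id]) for token_id in enc.encode(word)]
    print(f"{word!r} -> {fragments}")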

GPT's Tokenizer

See how numbers are treated: this may explain why earlier GPTs struggled with math involving more than 3 digits

Rule of thumb for typical English writing:
• 1 token is ~4 characters
• 1 token is ~0.75 words
• So 1,000 tokens is ~750 words

The collected works of Shakespeare are ~900,000 words, or 1.2M tokens

Obviously the token count is higher for math, scientific terms and code
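You can check the rule of thumb on any text you like. A minimal sketch, again assuming the tiktoken package:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = ("To be, or not to be, that is the question: whether 'tis nobler "
        "in the mind to suffer the slings and arrows of outrageous fortune.")

tokens = enc.encode(text)
print(f"{len(text)} characters, {len(text.split())} words, {len(tokens)} tokens")
print(f"~{len(text) / len(tokens):.2f} characters per token")  # typically close to 4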
Context Window

Max number of tokens that the model can consider when generating the next token

Includes the original input prompt, the subsequent conversation, the latest input prompt, and almost all of the output generated so far

It governs how well the model can remember references, content and context

Particularly important for multi-shot prompting, where the prompt includes examples, or for long conversations

Or for questions on the complete works of Shakespeare! (See the sketch below.)
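This is easy to see at the API level: the model itself is stateless, so the client re-sends the entire conversation with every call, and all of it must fit in the context window. A minimal sketch, assuming the openai package and an OPENAI_API_KEY in the environment:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_message):
    history.append({"role": "user", "content": user_message})
    # The ENTIRE history, plus the reply being generated, must fit
    # within the model's context window
    response = client.chat.completions.create(model="gpt-4o", messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("Who wrote Hamlet?"))
print(chat("Roughly how many words are in his collected works?"))  # "his" resolves only because the first turn is still in context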
API costs

Chat interfaces typically have a Pro plan with a monthly subscription: rate limited, but no per-usage charge

APIs typically have no subscription, but charge per API call

The cost is based on the number of input tokens and the number of output tokens
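Since input and output tokens are usually priced separately, estimating a bill is simple arithmetic. A minimal sketch; the prices below are placeholders rather than real rates, so check the leaderboard linked below for current numbers.

# Placeholder prices in $ per 1M tokens; NOT real rates
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    # Input and output tokens are metered and priced separately
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Actual usage is reported on each API response, e.g. response.usage.prompt_tokens
# and response.usage.completion_tokens in the openai package
print(f"${estimate_cost(12_000, 1_500):.4f}")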
Context Windows and API Costs

https://www.vellum.ai/llm-leaderboard
PROGRESS REPORT

Congratulations! 10% there

What you can do ALREADY

Write code to call OpenAI's frontier models & summarize

Contrast the leading 6 Frontier LLMs

Discuss transformers, tokens, context windows, API costs and more!

What you'll be able to do BY END OF THE NEXT LECTURE

Confidently code with the OpenAI API

Use one-shot prompting, streaming, markdown & JSON results

Implement a business solution - in a matter of minutes
