
Know Thy Frenemy: Understanding LLMs – Past, Present, and Future

Barak Shoshany
Department of Physics, Brock University
Motivation
• Many professors and students use LLMs.
• Other talks focus on implications of LLMs.
• This talk focuses on LLMs themselves.
• Understanding LLMs better will help professors:
• Optimize use.
• Dispel misconceptions.
• Develop situational awareness.
• Incorporate in courses.
• Instruct students on proper use.
Part I: Past
Neural networks 1/5
• Curve fitting: Find function that
approximates data.
• Minimize error.
• Example: approximate the data by a polynomial (see the sketch below).
• More parameters = better fit.
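A minimal sketch of curve fitting in Python (numpy only; the data here is made up for illustration):

import numpy as np

# Hypothetical noisy data sampled from an underlying curve.
x = np.linspace(0, 10, 50)
y = np.sin(x) + np.random.normal(scale=0.1, size=x.size)

# Fit polynomials of increasing degree: more parameters = better fit.
for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)   # least-squares fit
    y_hat = np.polyval(coeffs, x)       # evaluate the fitted polynomial
    mse = np.mean((y - y_hat) ** 2)     # the error we minimize
    print(f"degree {degree}: mean squared error = {mse:.4f}")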
Neural networks 2/5
• Neural network: “Extremely
sophisticated curve fitting.”
• Loosely inspired by the brain (≈100 billion neurons).
• Width: # of neurons per layer.
• Depth: # of hidden layers.
• Deep network: multiple hidden
layers.
• Deeper layer = more abstract.
Neural networks 3/5
• Each connection has a weight (≈ importance).
• Each neuron computes a weighted sum of the previous layer's outputs.
• The result is passed through a non-linear activation function.
• Universal approximation theorem: any continuous function can be approximated arbitrarily well by a neural network with enough neurons.
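As a rough sketch of this arithmetic, here is one layer of a tiny network in Python (toy sizes; real weights come from training, these are random placeholders):

import numpy as np

def relu(z):
    # Non-linear activation function.
    return np.maximum(0, z)

def layer(inputs, weights, biases):
    # Each neuron: weighted sum of the previous layer, then activation.
    return relu(weights @ inputs + biases)

rng = np.random.default_rng(0)
x = rng.normal(size=4)         # 4 inputs
W = rng.normal(size=(3, 4))    # 3 neurons, each with 4 connection weights
b = np.zeros(3)
print(layer(x, W, b))          # outputs of the 3 neurons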
Neural networks 4/5
• Example – image to text (simplified):
• Input layer = pixels.
• Early hidden layers = edges, orientation, colors.
• Middle hidden layers = textures, motifs.
• Late hidden layers = objects, scene context.
• Output layer = text description of image.

“A kitten playing with a ball of yarn.”
Neural networks 5/5
• To determine the weights, we do supervised training.
• Can start with collection of image-text pairs.
• Available data: ≈100 billion pairs.
• Initial weights completely random.
• Feed image to network, get guess of text (nonsense at first).
• Measure how wrong guess is (loss).
• Back-propagation: Trace error backwards through layers, adjust
weights to correct.
• Repeat for all pairs (one pass = one epoch), over multiple epochs, to minimize the loss.
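A minimal sketch of this loop in PyTorch (toy model and random stand-in data; real image-to-text training is vastly larger, but the loop has the same shape):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # weights start random
loss_fn = nn.MSELoss()

inputs = torch.randn(100, 8)    # stand-ins for the images
targets = torch.randn(100, 4)   # stand-ins for the paired text

for epoch in range(10):                  # multiple epochs
    for x, y in zip(inputs, targets):    # one pass over all pairs = one epoch
        guess = model(x)                 # feed input, get a guess
        loss = loss_fn(guess, y)         # measure how wrong the guess is (loss)
        optimizer.zero_grad()
        loss.backward()                  # back-propagation: trace error backwards
        optimizer.step()                 # adjust weights to correct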
How LLMs work 1/5
• LLM = Large Language Model.
• Probabilistic model to predict next token in text.
• Token: letter / symbol / word / part of word.
• Example:
The cat → The cat is → The cat is fluffy
• Pre-training: Self-supervised learning on trillions of tokens (web,
books, code, etc.)
• Self-supervised: Just take existing text and hide a token.
The cat is _____ → try to predict → fluffy? → minimize loss
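A toy illustration of why this is "self-supervised": any existing text generates training examples for free (tokenizing naively by whitespace here; real tokenizers split into sub-word pieces):

text = "The cat is fluffy".split()   # tokens: ["The", "cat", "is", "fluffy"]

# Hide each token in turn; the visible prefix is the input,
# the hidden token is the target the model must predict.
for i in range(1, len(text)):
    prefix, target = text[:i], text[i]
    print(f"input: {' '.join(prefix):<15} predict: {target}")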
How LLMs work 2/5
• After pre-training, LLM just “autocompletes”; not yet a chatbot.
Are cats fluffy? → Are cats fluffy? What about dogs?
• Instruction-tuning: Train with samples of chats:
User: Is salt salty?
Assistant: Yes, salt is salty.
• Reinforcement Learning from Human Feedback (RLHF): Humans give
“thumbs up/down” to chat replies.
• Base model turns into helpful chat/instruct model.
User: Are cats fluffy?
Assistant: Some are, some aren’t; depends on the breed.
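Instruction-tuning data is typically structured as role-tagged turns, along these lines (the exact format varies by lab; this is an illustrative sketch):

# One instruction-tuning example: the model is trained to produce
# the assistant turn given the conversation so far.
sample = {
    "messages": [
        {"role": "user", "content": "Is salt salty?"},
        {"role": "assistant", "content": "Yes, salt is salty."},
    ]
}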
How LLMs work 3/5
• Current LLMs: based on transformer architecture.
• Attention heads can “look back” on all tokens (within context
window).
• “Pay attention” to earlier words, find links and connections.
• Work on all tokens in parallel – much faster than sequential (recurrent) models.
• Can run on GPUs (Graphics Processing Units).
• Scale up easily; larger models are better.
• Latest models have trillions of weights!
• Transformers caused the AI “boom”.
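The core operation, scaled dot-product attention, fits in a few lines of numpy; every token is processed in parallel and mixes in information from the others:

import numpy as np

def attention(Q, K, V):
    # Compare each token's query against every token's key...
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # ...turn the scores into "how much attention to pay" (softmax)...
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # ...and output an attention-weighted mix of the values.
    # (Real LLMs also mask future tokens, so attention only looks back.)
    return weights @ V

rng = np.random.default_rng(0)
tokens, dim = 5, 8   # 5 tokens in the context window
Q, K, V = (rng.normal(size=(tokens, dim)) for _ in range(3))
print(attention(Q, K, V).shape)   # (5, 8): one refined vector per token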
How LLMs work 4/5
• Embedding: Each token = point (vector) in very high-dim space.
• ≈10,000-30,000 dimensions.
• Naturally occurs as side-effect of training.
• Represents context-agnostic semantics.
• Transformer layers refine this to true context-aware meaning.
• Example: Embedding of “spring” represents “springness”.
• Flowers bloom in spring → season context.
• The spring absorbs shocks → object context.
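A sketch of what "tokens as vectors" buys: similarity of meaning becomes geometry. The 3-dimensional vectors below are invented for illustration; real embeddings have thousands of dimensions:

import numpy as np

def cosine(a, b):
    # Closeness in embedding space ≈ closeness in meaning.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hand-made toy embeddings (hypothetical values).
spring = np.array([0.9, 0.8, 0.1])
season = np.array([1.0, 0.1, 0.0])
coil   = np.array([0.1, 1.0, 0.1])
banana = np.array([0.0, 0.1, 1.0])

print(cosine(spring, season))   # high: "spring" relates to seasons
print(cosine(spring, coil))     # also high: "spring" relates to coils
print(cosine(spring, banana))   # low: unrelated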
How LLMs work 5/5
• Subspaces of embedding space correspond to abstract concepts.
• “gender” axis: “she” and “he” in opposite directions.
• “plurality” axis: “dog” and “dogs” in opposite directions.
• Can do “arithmetic”: king − male + female + plural ≈ queens.
• Embeddings + transformers = true mathematical representation of
language.
• Impossible to do manually!
• Original use for transformers: machine translation.
English → embeddings → French
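This arithmetic can be tried directly on classic pre-trained word vectors, e.g. via the gensim library (large download on first use; exact neighbours vary by model):

import gensim.downloader

# Pre-trained word2vec vectors trained on Google News text.
vectors = gensim.downloader.load("word2vec-google-news-300")

# The classic example: king − man + woman ≈ queen.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))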
Part II: Present
The LLM zoo 1/6
• Understanding model types:
• Non-reasoning models: Predict next token with no prior planning.
• Reasoning models: Perform long and complicated reasoning, solve
problems at PhD/research level. More expensive.
• Big models: Trillions of weights. Expensive to use. Large knowledge base.
• Small (“mini”/“flash”) models: Fewer weights. Distilled from big models.
Cheaper. Less knowledge but good performance in key areas.
The LLM zoo 2/6
• OpenAI models:
• ChatGPT is just the chat interface; many different models!
• Non-reasoning models:
• GPT-4o: Older model, good for fun stuff (chatting, images), free (limited use).
• GPT-4o-mini: Small model, not as good, free.
• GPT-4.1: Improved model, accessible only via API.
• GPT-4.5: Largest model, broader knowledge, higher EQ, paid only.
• Reasoning models:
• o3: Best model available, PhD+ level in all disciplines, problem solving, complex
coding; paid only. o3-pro available soon, pro plan only.
• o4-mini: Newer but smaller model, free (limited use). Do not confuse with 4o-mini!
• o4-mini-high: Thinks longer, paid only.
The LLM zoo 3/6
• OpenAI capabilities (vary by model, not all free):
• Text: Chat, read, analyze, edit, generate, summarize, translate…
• Code: Read, edit, debug, generate, refactor, explain, write tests…
• Data (CSV / JSON / Excel / etc.): Parse, analyze, extract, visualize…
• Science/math: Explain, teach, reason, solve problems at PhD+ level…
• Image: See, edit, generate; photos, art, slides, diagrams…
• Voice: Listen, generate, “advanced voice”…
• Video: Watch, camera/screen sharing…
• Tool use: Autonomously search the web, run Python code…
• Custom instructions, memory…
• Context windows: 128K for GPT-4o, GPT-4.5, o3, o4-mini, 1M for GPT-4.1.
The LLM zoo 4/6
• OpenAI other features:
• Deep Research: Spends up to 30 mins collecting data from dozens of
sources; powered by a specialized version of o3. Free (limited use of a lighter version).
• Create custom GPTs that follow specific instructions. Creating them requires a
paid plan, but free users can use them.
• Create recurring tasks that run autonomously. Paid only.
• Sora: Generate videos based on script. Paid only.
• Operator: Browse the web and perform tasks. Pro plan only.
• Plus plan: 20 USD/month, pro plan: 200 USD/month.
The LLM zoo 5/6
• Google models:
• Gemini 2.5 Pro: Latest model, reasoning. Free (limited use).
• Gemini 2.5 Flash: Smaller model, reasoning. Free (limited use).
• Capabilities: Text, code, data, science/math, image, voice, video, code
execution, web search, custom “gems”, Deep Research, Google app
integration, 1M context window.
• Gemini Advanced plan: 27 CAD/month.
The LLM zoo 6/6
• Anthropic models:
• Claude 3.7 Sonnet: Latest model, non-reasoning. Free (limited use).
• Claude 3.7 Sonnet Thinking: Reasoning variant. Paid only.
• Capabilities: Text, code, data, science/math, image input only. 200K
context window.
• No voice, video, code execution, custom bots. Web search in US only.
• Pro plan: 17 USD/month, max plan: 100-200 USD/month.
Benchmarks 1/2
• GPQA Diamond: Graduate-Level Google-Proof Q&A.
• 198 PhD-level multiple-choice questions in biology, physics, chemistry.
• Humans + web access: 22%.
• Gemini 2.5 Pro: 84%.
• OpenAI o3: 83%.
• OpenAI o4-mini-high: 78%.
• Claude 3.7 Sonnet Thinking: 77%.
• OpenAI GPT-4.5: 71%.
• Gemini 2.5 Flash: 70%.
Benchmarks 2/2
• Humanity’s Last Exam: 2,500 PhD-level MC or short answer
questions in math, physics, biology, medicine, humanities, social
science, computer science, engineering, chemistry, and more.
• Leading expert human (estimate): 6-8% (mostly multiple-choice guesses).
• OpenAI o3: 20%.
• OpenAI o4-mini-high: 18%.
• Gemini 2.5 Pro: 17%.
• Gemini 2.5 Flash: 12%.
• Claude 3.7 Sonnet Thinking: 10%.
Common misconceptions 1/7
• Misconception: “LLMs only predict the next token, so they can’t do
_____”.
• Rebuttals:
• How else would you write text?
• Humans also predict the next token.
• By predicting the next token in training, neural net encodes syntax,
semantics, world knowledge, etc.
• Analogy: predicting the next move in chess well requires understanding the whole game.
Common misconceptions 2/7
• Misconception: “LLMs just memorize the entire Internet; no better
than Googling”.
• Rebuttals:
• High scores in Google-proof benchmarks.
• Neural net doesn’t store Internet pages verbatim; uses them to internalize
general meaning.
Common misconceptions 3/7
• Misconception: “LLMs can’t even multiply two numbers”.
• Rebuttals:
• Was true of ChatGPT in 2023. Now LLMs just use Python to multiply.
• Most humans would use a calculator.
• Training doesn’t focus on this skill; could theoretically be improved, but
there’s no need.
Common misconceptions 4/7
• Misconception: “I tried an LLM once and it couldn’t do _____”.
• Rebuttal:
• Was true of ChatGPT in 2023. Now LLMs are much more capable.

• Misconception: “I tried an LLM today and it couldn’t do _____”.
• Rebuttal:
• Choose the right model. GPT-4o-mini can’t do much; o3 can do a lot.
Common misconceptions 5/7
• Misconception: “LLMs can’t learn anything new”.
• Rebuttals:
• Latest models learn on the spot: read documents, web search.
• Examples:
• First upload docs/manual/tutorial, then ask question.
• Enable search to get up-to-date info (o3/GPT-4.5 enable by default).
Common misconceptions 6/7
• Misconception: “Students cannot use LLMs to cheat on this
assessment because it _____”.
• Rebuttals:
• Requires private course material? Upload notes/slides.
• Requires vision? No problem.
• Requires clicking with the mouse? Operator / Claude computer use.
• Requires higher-order reasoning? PhD-level reasoning models.
• Randomizes numbers? No problem.
• Is Google-proof? See earlier slide.
• Is proctored remotely? Interview Coder app, or just use phone.
• Is timed? LLMs are faster than humans.
• Is run through AI detector? Doesn’t work.
• Is proctored in person? Earpiece + smart glasses (soon).
Common misconceptions 7/7
• Misconception: “Student LLM use hurts learning outcomes”.
• Rebuttals:
• Certainly true if the student just copies and pastes!
• Numerous studies show LLM tutors improve student performance (in
exams with no LLM available).
• My own experience:
• Students said my AI chatbot helped a lot.
• ASTR 1P02 students who used the chatbot scored, on average, 6 points higher.
• (Note: Does not imply causation.)
Create your own chatbot 1/2
• Easiest way: custom GPTs.
• Gemini gems cannot be shared.
• Create custom instructions: course material, pedagogical
preferences, logistics…
• Limitations:
• Creating a custom GPT requires a paid plan.
• Free users must use GPT-4o-mini, an inferior model.
• Even paid users must use GPT-4o, not a reasoning model.
• Privacy concerns.
Create your own chatbot 2/2
• Harder way: use an API.
• Much more customizable, including choice of model.
• Students can use it for free.
• Limitations:
• Must have good programming skills (HTML/CSS/JavaScript/Python).
• Must host own website.
• Must pay whenever it’s used (per token).
• Workaround: Gemini 2.5 Flash (free tier: 500 requests/day)!
• Easy way: interface = ChatGPT; model = GPT-4o / GPT-4o-mini.
• Hard way: interface = my own website; model = Gemini 2.5 Flash via API (minimal sketch below).
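A minimal sketch of the "hard way" backend call, using Google's google-generativeai Python package (the system prompt is a hypothetical example; wrap this in your own web server so students never see the API key):

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")   # free-tier key from Google AI Studio

model = genai.GenerativeModel(
    "gemini-2.5-flash",
    system_instruction="You are a tutor for ASTR 1P02. Answer from the course notes.",
)
response = model.generate_content("Why do stars twinkle?")
print(response.text)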
Part III: Future
To be discussed in
another workshop…

Any questions?
