Presentation 11

Large Language Models (LLMs) are computational models for natural language processing that utilize unsupervised and semi-supervised learning to understand and generate text. The evolution of LLMs has progressed from statistical language modeling to advanced models like GPT-4, with various types such as autoregressive models, masked language models, and encoder-decoder models serving different use cases. Future applications of LLMs include personal AI assistants, medical advisors, and business analytics.


LLM

Introduction to Large Language Models (LLMs)


Definition
• A type of computational model designed for natural language processing tasks such as language generation
• Applies unsupervised and semi-supervised learning
• Learns statistical relationships from large amounts of text
History
• Began with statistical language modelling, pioneered at IBM in the 1980s–90s
• Advanced to Neural Machine Translation at Google in 2016
• Then to GPT-1 from OpenAI in 2018
• And to GPT-4 in 2023
Types of LLM and Use cases
• Autoregressive Models
Definition: Predict the next token based on previously generated tokens.
Examples: GPT series (GPT-3, GPT-4), LLaMA (Meta).
Use cases: Text generation and sequential tasks.
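A minimal sketch of the autoregressive decoding loop: the toy `next_token_logits` scoring function and vocabulary below are illustrative stand-ins for a trained model, not any real system.

```python
import numpy as np

VOCAB = ["<eos>", "the", "cat", "sat", "on", "mat"]

def next_token_logits(context_ids):
    # Stand-in for a trained model: strongly favors the token after the last one.
    logits = np.full(len(VOCAB), -1.0)
    logits[(context_ids[-1] + 1) % len(VOCAB)] = 5.0
    return logits

def generate(prompt_ids, max_new_tokens=4):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)        # score every vocabulary token
        next_id = int(np.argmax(logits))       # greedy: pick the most likely one
        ids.append(next_id)                    # feed it back in as new context
        if VOCAB[next_id] == "<eos>":          # stop at end-of-sequence
            break
    return " ".join(VOCAB[i] for i in ids)

print(generate([1, 2]))  # "the cat sat on mat <eos>"
```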

• Masked Language Models (MLMs)
Definition: Predict masked (hidden) tokens in a sentence by analyzing the entire context.
Examples: BERT (Bidirectional Encoder Representations from Transformers), RoBERTa.
Use cases: Excellent at understanding context and relationships in text; good for classification and question answering.
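A toy illustration of the masked-prediction objective: the hand-written scoring table below stands in for a trained bidirectional model such as BERT.

```python
# Toy masked-token filling: score each candidate by how well it fits
# the words on BOTH sides of the [MASK] (bidirectional context).
context_scores = {           # hand-written stand-in for a trained MLM
    ("the", "sat"): {"cat": 0.8, "mat": 0.1, "on": 0.05},
}

def fill_mask(tokens):
    i = tokens.index("[MASK]")
    left, right = tokens[i - 1], tokens[i + 1]   # look in both directions
    candidates = context_scores[(left, right)]
    best = max(candidates, key=candidates.get)   # highest-scoring filler
    return tokens[:i] + [best] + tokens[i + 1:]

print(fill_mask(["the", "[MASK]", "sat"]))  # ['the', 'cat', 'sat']
```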

• Encoder-Decoder Models (Seq2Seq)
Definition: Combine encoding of the input and decoding of the output to generate context-aware sequences.
Examples: T5 (Text-to-Text Transfer Transformer), BART (Bidirectional and Auto-Regressive Transformers).
Use cases: Designed for tasks like text summarization, translation, and question answering.
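A minimal sketch of using an encoder-decoder model for summarization, assuming the Hugging Face `transformers` package is installed and the public `t5-small` checkpoint can be downloaded:

```python
from transformers import pipeline

# T5 is trained text-to-text, so every task is "input text -> output text".
summarizer = pipeline("summarization", model="t5-small")

article = (
    "Large Language Models learn statistical relationships from large "
    "amounts of text and can generate, classify, and translate language."
)
print(summarizer(article, max_length=20, min_length=5)[0]["summary_text"])
```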
Architecture – Workflow
• Large datasets of text are processed
• The processed data passes through embedding layer(s)
• Embedded data is trained on stacked Transformer layers
• A self-attention mechanism is applied within each layer to capture context
• Masking is applied for different use cases
• Optimization is performed on the assembled model
• Outputs are decoded and mapped back to words through the vocabulary
Architecture – Process data
• Tokenization
• Encoding
• Data cleaning
• Data synthesis
• Fine-tuning
Process data - Tokenization
• Tokenize words into numbers (token IDs) so the model can process them
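A minimal sketch of word-level tokenization with a toy vocabulary; real LLMs use learned sub-word schemes such as BPE, so this mapping is purely illustrative.

```python
# Toy word-level tokenizer: map each word to an integer id.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}

def tokenize(text):
    # Unknown words fall back to the <unk> id.
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def detokenize(ids):
    id_to_word = {i: w for w, i in vocab.items()}
    return " ".join(id_to_word[i] for i in ids)

ids = tokenize("The cat sat on the mat")
print(ids)              # [1, 2, 3, 4, 1, 5]
print(detokenize(ids))  # "the cat sat on the mat"
```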
Process data - Encoding
• Applies Positional Encoding to embed the position of
each token in the input sequence.
• Ensures the model understands the order of words.
Positional Encoding - Formula

PE(pos, 2i)   = sin(pos / 10000^(2i/d))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d))

• pos is the token's position in the sequence; i indexes the embedding dimension; d is the embedding size.

Self-Attention - Formula

Attention(Q, K, V) = softmax(Q·Kᵀ / √d) · V

• Q represents the Query vector.
• K represents the Key vector.
• V represents the Value vector.
• d is the dimension of the key/query vectors.

• Query: Used to match relevant information.
• Key: Enables finding matching queries.
• Value: The actual information passed based on the attention weight.
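A minimal numpy sketch of both formulas above; the sequence length and dimensions are toy values chosen for illustration.

```python
import numpy as np

def positional_encoding(seq_len, d):
    # PE(pos, 2i) = sin(pos / 10000^(2i/d)), PE(pos, 2i+1) = cos(...)
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d, 2)[None, :]
    angles = pos / np.power(10000.0, i / d)
    pe = np.zeros((seq_len, d))
    pe[:, 0::2] = np.sin(angles)   # even dimensions get sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions get cosine
    return pe

def attention(Q, K, V):
    # softmax(Q·Kᵀ / √d) · V, with a numerically stable softmax
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

x = np.random.rand(4, 8) + positional_encoding(4, 8)  # 4 tokens, dim 8
print(attention(x, x, x).shape)                       # (4, 8)
```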
Architecture – Embedding layers
• Word embeddings are dense vectors that words or tokens are converted into before being processed.
• Vectors are multi-dimensional so they can capture semantic similarities across the vocabulary.
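A minimal sketch of an embedding lookup table; the random matrix below stands in for weights that would be learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 6, 4
embedding_table = rng.normal(size=(vocab_size, dim))  # learned in practice

def embed(token_ids):
    return embedding_table[token_ids]   # one dense vector per token

def cosine(a, b):
    # Cosine similarity: how semantically "close" two vectors are.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

vecs = embed(np.array([1, 2, 3]))
print(vecs.shape)                 # (3, 4)
print(cosine(vecs[0], vecs[1]))   # similarity between two token vectors
```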
Architecture - Training
• Multiple stacked Transformer layers, each combining:
• Self-attention layers: focus on the relevant parts of the input
• Feed-forward neural networks: apply a non-linear transformation to the output of the self-attention layers
• Shortcut (residual) connections between layers to minimize information loss
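A minimal sketch of one Transformer block showing the residual shortcuts; layer normalization and multi-head splitting are omitted to keep the sketch short.

```python
import numpy as np

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def transformer_block(x, W1, W2):
    x = x + attention(x, x, x)          # residual shortcut around attention
    hidden = np.maximum(0.0, x @ W1)    # feed-forward with ReLU non-linearity
    return x + hidden @ W2              # residual shortcut around the FFN

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))             # 4 tokens, model dim 8
out = transformer_block(x, rng.normal(size=(8, 16)), rng.normal(size=(16, 8)))
print(out.shape)                        # (4, 8)
```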
Architecture – Self-Attention Mechanism
• Multi-Head Attention: allows the model to learn different aspects of word relationships in parallel.
• Each head performs a separate attention operation, then the results are combined.
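A minimal sketch of multi-head attention via splitting the model dimension across heads; the per-head projection matrices are omitted for brevity, so this shows only the split/attend/concatenate pattern.

```python
import numpy as np

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def multi_head_attention(x, num_heads):
    # Split the embedding dimension into equal slices, one per head.
    heads = np.split(x, num_heads, axis=-1)
    # Each head attends independently, learning a different relationship.
    outputs = [attention(h, h, h) for h in heads]
    return np.concatenate(outputs, axis=-1)       # combine the heads' results

x = np.random.rand(4, 8)                           # 4 tokens, dim 8
print(multi_head_attention(x, num_heads=2).shape)  # (4, 8)
```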
Architecture - Masking
• Bidirectional Attention (BERT): The model looks at both
left and right of a word to predict masked tokens.
• Autoregressive Attention (GPT): The model only looks at
prior words when generating text, useful for predicting
the next token.
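A minimal sketch of the two masking styles expressed as attention masks (1 = may attend, 0 = hidden):

```python
import numpy as np

n = 4  # sequence length

# Bidirectional (BERT-style): every token may attend to every other token.
bidirectional_mask = np.ones((n, n))

# Autoregressive (GPT-style): token i may only attend to positions <= i.
causal_mask = np.tril(np.ones((n, n)))

print(causal_mask)
# [[1. 0. 0. 0.]
#  [1. 1. 0. 0.]
#  [1. 1. 1. 0.]
#  [1. 1. 1. 1.]]
# Before the softmax, masked (0) positions get a score of -inf so they
# receive zero attention weight.
```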
Architecture – Output processing
• A decoder step turns token IDs back into words
• Softmax turns the scores for the possible next words into probabilities
• Vocabulary and tokenization: a fixed vocabulary size (often 30,000–50,000 tokens), split into sub-word tokens to handle rare words and different languages effectively.
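A minimal sketch of turning output logits into a word via softmax; the logits and vocabulary are toy values.

```python
import numpy as np

vocab = ["the", "cat", "sat", "on", "mat"]
logits = np.array([1.0, 3.5, 0.2, -1.0, 2.0])   # toy model output scores

probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax -> probabilities

print({w: round(float(p), 3) for w, p in zip(vocab, probs)})
print("next word:", vocab[int(np.argmax(probs))])  # "cat"
```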
Architecture - Optimization
• Optimizers: Adam or AdamW update the attention weights (and all other parameters) via backpropagation
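A minimal sketch of one AdamW-style update on a single weight vector; hyperparameters are the common defaults, and a real training run would loop over batches and all model parameters.

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
               eps=1e-8, weight_decay=0.01):
    m = b1 * m + (1 - b1) * grad          # momentum (1st-moment) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # variance (2nd-moment) estimate
    m_hat = m / (1 - b1 ** t)             # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    # AdamW decouples weight decay from the gradient-based update.
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v

w = np.array([0.5, -0.3])
m = v = np.zeros_like(w)
for t in range(1, 4):                      # a few toy gradient steps
    grad = 2 * w                           # gradient of the toy loss w**2
    w, m, v = adamw_step(w, grad, m, v, t)
print(w)                                   # weights move toward zero
```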
Scale
• GPT-4 is estimated to have roughly 10x the parameters of GPT-3 (OpenAI has not disclosed the exact count)
• GPT-3 and LLaMA were trained on much more recent datasets that included newer sources: websites, social media, other online content, ...
Research papers
• GPT-4: https://arxiv.org/abs/2303.08774

• PaLM: https://arxiv.org/abs/2204.02311

• LLaMA: https://arxiv.org/abs/2302.13971
Github Links
• OpenLLM: https://github.com/bentoml/OpenLLM

• OpenLLaMA: https://github.com/openlm-research/open_llama

• GPT: https://github.com/openai/openai-cookbook
Comparing LLM models
Future Uses of LLMs
• Personal AI Assistants

• Medical Advisors

• Business Analytics
