LLM Cheatsheet

Large Language Models (LLMs)
Large neural networks (transformers with billions of parameters) trained at internet scale to estimate the probability of sequences of words. Abilities (and the computing resources needed) tend to rise with the number of parameters.
Ex: GPT, FLAN-T5, LLaMA, PaLM, BLOOM

USE CASES
– Standard NLP tasks (classification, summarization, etc.)
– Content generation
– Reasoning (Q&A, planning, coding, etc.)

In-context learning Specifying the task to perform directly in the prompt
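
For illustration only, a minimal sketch of in-context (few-shot) learning in Python: the task description and worked examples are supplied in the prompt string itself, with no fine-tuning. The reviews, labels, and wording below are invented.

    # Few-shot prompt: the task and example pairs live in the prompt itself.
    # The reviews and labels are made up for illustration.
    prompt = """Classify the sentiment of each review as Positive or Negative.

    Review: "The battery lasts all day, great purchase."
    Sentiment: Positive

    Review: "Stopped working after a week."
    Sentiment: Negative

    Review: "Setup was effortless and the screen is gorgeous."
    Sentiment:"""

    # Sent as-is to any LLM completion endpoint, the model is expected to
    # continue the pattern and answer "Positive".
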
Transformer components
Token Word or sub-word; the basic unit processed by transformers
Encoder Processes the input sequence to generate a vector representation (or embedding) for each token
Decoder Processes input tokens to produce new tokens
Embedding layer Maps each token to a trainable vector
Positional encoding Vector added to the token embedding vector to keep track of the token's position
Self-Attention Computes the importance of each word in the input sequence to all other words in the sequence
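
To make these components concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention over token embeddings with added positional encodings; the sizes and random weights are arbitrary stand-ins for trained parameters, not any particular model's.

    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_model = 4, 8                       # 4 tokens, toy embedding size

    # Embedding layer output plus positional encoding (random stand-ins here)
    token_embeddings = rng.normal(size=(seq_len, d_model))
    positional_encoding = rng.normal(size=(seq_len, d_model))
    x = token_embeddings + positional_encoding

    # Learned query/key/value projections (random stand-ins)
    W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
    Q, K, V = x @ W_q, x @ W_k, x @ W_v

    # Scaled dot-product self-attention: importance of every token to every other token
    scores = Q @ K.T / np.sqrt(d_model)                                    # (seq_len, seq_len)
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row-wise softmax
    output = weights @ V                                                   # context-aware representations

    print(weights.round(2))   # row i = attention of token i over all tokens
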
Decoder only = Autoregressive model
Ex: GPT, BLOOM
PRE-TRAINING OBJECTIVE To predict the next token based on the previous sequence of tokens (= Causal Language Modeling)
OUTPUT Next token
USE CASES Text generation
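
A toy sketch of the autoregressive loop: at each step the model only proposes a distribution over the next token given the tokens generated so far. The hand-written bigram table below stands in for a real decoder-only model and is invented for illustration.

    # Toy "model": next-token probabilities given only the previous token.
    next_token_probs = {
        "<start>": {"the": 0.6, "a": 0.4},
        "the":     {"cat": 0.5, "dog": 0.3, "moon": 0.2},
        "a":       {"cat": 0.5, "dog": 0.5},
        "cat":     {"sleeps": 0.7, "<end>": 0.3},
        "dog":     {"barks": 0.8, "<end>": 0.2},
        "moon":    {"<end>": 1.0},
        "sleeps":  {"<end>": 1.0},
        "barks":   {"<end>": 1.0},
    }

    sequence = ["<start>"]
    while sequence[-1] != "<end>":
        probs = next_token_probs[sequence[-1]]   # distribution over the next token
        next_token = max(probs, key=probs.get)   # greedy pick; see sampling techniques below
        sequence.append(next_token)

    print(" ".join(sequence[1:-1]))              # -> "the cat sleeps"
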
Encoder-Decoder = Seq-to-seq model
Ex: T5, BART
PRE-TRAINING OBJECTIVE Varies from model to model (e.g., span corruption like T5)
OUTPUT Sentinel token + predicted tokens
USE CASES Translation, Q&A, summarization
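
A rough illustration of T5-style span corruption, loosely following the example in the T5 paper: random spans of the input are replaced with sentinel tokens, and the training target is each sentinel followed by the text it replaced. The spans chosen here are arbitrary.

    original = "Thank you for inviting me to your party last week"

    # Input with two corrupted spans replaced by sentinel tokens
    corrupted_input = "Thank you <extra_id_0> me to your party <extra_id_1> week"

    # Target: each sentinel followed by the text it replaced, with a final sentinel marking the end
    target = "<extra_id_0> for inviting <extra_id_1> last <extra_id_2>"
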
TECHNIQUES TO CONTROL RANDOM SAMPLING
Random sampling The model chooses an output word at random, using the probability distribution to weight the selection (could be too creative)
– Top K The next token is drawn from the k tokens with the highest probabilities
– Top P The next token is drawn from the tokens with the highest probabilities whose combined probabilities exceed p
– Temperature Influences the shape of the probability distribution through a scaling factor in the softmax layer
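
A compact NumPy sketch of how these knobs reshape the next-token distribution before sampling; the vocabulary, logits, and cut-off values are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = np.array(["cake", "donut", "banana", "apple", "juice"])
    logits = np.array([2.0, 1.5, 0.3, 0.2, -1.0])        # made-up raw model scores

    def softmax_with_temperature(logits, temperature=1.0):
        # Temperature rescales the logits before the softmax: <1 sharpens, >1 flattens
        scaled = logits / temperature
        exp = np.exp(scaled - scaled.max())
        return exp / exp.sum()

    def sample_top_k(probs, k=2):
        # Keep only the k most probable tokens, renormalize, then sample
        top = np.argsort(probs)[::-1][:k]
        return vocab[rng.choice(top, p=probs[top] / probs[top].sum())]

    def sample_top_p(probs, p=0.9):
        # Keep the smallest set of tokens whose cumulative probability reaches p
        order = np.argsort(probs)[::-1]
        cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1
        keep = order[:cutoff]
        return vocab[rng.choice(keep, p=probs[keep] / probs[keep].sum())]

    probs = softmax_with_temperature(logits, temperature=0.7)
    print(sample_top_k(probs, k=2))
    print(sample_top_p(probs, p=0.9))
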
© 2024 Dataiku