Building LLMs (2024)
Using Large Language Models (LLMs)
https://github.com/Lightning-AI/litgpt
STAGE 1: BUILDING
[Roadmap figure: Building an LLM: 1) Data preparation & sampling, 2) Attention mechanism, 3) LLM architecture; 4) Pretraining produces the Foundation model: 5) Training loop, 6) Model evaluation, 7) Load pretrained weights; 9) Finetuning (e.g., on an instruction dataset) produces a Classifier or a Personal assistant]
https://github.com/rasbt/LLMs-from-scratch
Sample text
"In the heart of the city stood the old library, a relic from a bygone era. Its
stone walls bore the marks of time, and ivy clung tightly to its facade …"
[Figure: input text → preprocessing steps → LLM → output layers; the LLM generates text one token at a time, e.g., starting from "This"]
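A minimal sketch of the data preparation & sampling step, assuming the tiktoken GPT-2 tokenizer (an assumption, not shown on the slides): the text is tokenized, and a sliding window produces input-target pairs in which the target is the input shifted by one token (next-token prediction).

import tiktoken
import torch

tokenizer = tiktoken.get_encoding("gpt2")

text = ("In the heart of the city stood the old library, a relic from a "
        "bygone era. Its stone walls bore the marks of time, and ivy clung "
        "tightly to its facade ...")
token_ids = tokenizer.encode(text)

max_length, stride = 8, 8   # context size and window step (illustrative values)
inputs, targets = [], []
for i in range(0, len(token_ids) - max_length, stride):
    inputs.append(torch.tensor(token_ids[i:i + max_length]))
    targets.append(torch.tensor(token_ids[i + 1:i + max_length + 1]))

# Each target is the input shifted by one position; during pretraining the
# LLM learns to predict the next token at every position.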
“Our training corpus includes a new mix of data from publicly available sources,
which does not include data from Meta’s products or services. We made an effort
to remove data from certain sites known to contain a high volume of personal
information about private individuals. We trained on 2 trillion tokens of data as
this provides a good performance–cost trade-off, up-sampling the most factual
sources in an effort to increase knowledge and dampen hallucinations.”
“To train the best language model, the curation of a large, high-
quality training dataset is paramount. In line with our design
principles, we invested heavily in pretraining data. Llama 3 is
pretrained on over 15T tokens that were all collected from publicly
available sources.”
Introducing Meta Llama 3: The most capable openly available LLM to date (2024), https://ai.meta.com/blog/meta-llama-3/
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (2024), https://arxiv.org/abs/2404.14219
[Architecture comparison figure: two GPT-style decoder stacks side by side. In each transformer block: LayerNorm (left) or RMSNorm (right) → masked multi-head attention → Dropout → shortcut (+) → LayerNorm/RMSNorm → feed-forward linear layers → Dropout → shortcut (+); tokenized text enters at the bottom, and a final LayerNorm plus linear output layer sits on top. Left: 36 × transformer blocks with 20 attention heads; right: 32 × transformer blocks with 32 attention heads.]
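A minimal sketch of one such GPT-style transformer block in PyTorch; the dimensions, layer names, and the use of nn.MultiheadAttention are illustrative assumptions, not the exact implementation from the talk.

import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    # One GPT-style block: pre-norm, masked multi-head attention,
    # feed-forward layers, dropout, and shortcut (residual) connections.
    def __init__(self, emb_dim=768, num_heads=12, drop_rate=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(emb_dim)
        self.att = nn.MultiheadAttention(
            emb_dim, num_heads, dropout=drop_rate, batch_first=True
        )
        self.norm2 = nn.LayerNorm(emb_dim)
        self.ff = nn.Sequential(
            nn.Linear(emb_dim, 4 * emb_dim),
            nn.GELU(),
            nn.Linear(4 * emb_dim, emb_dim),
        )
        self.drop = nn.Dropout(drop_rate)

    def forward(self, x):                      # x: (batch, seq_len, emb_dim)
        seq_len = x.size(1)
        # Causal mask: each position may only attend to itself and earlier tokens
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        shortcut = x
        h = self.norm1(x)
        attn_out, _ = self.att(h, h, h, attn_mask=mask, need_weights=False)
        x = shortcut + self.drop(attn_out)

        shortcut = x
        x = shortcut + self.drop(self.ff(self.norm2(x)))
        return x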
LitGPT
https://github.com/Lightning-AI/litgpt
[Figure: GPT model: tokenized text → token embedding layer + positional embedding layer → 12 × transformer blocks (LayerNorm 1 → masked multi-head attention → Dropout → shortcut (+); LayerNorm 2 → feed forward → Dropout → shortcut (+)) → final LayerNorm → linear output layer]
The original linear output layer maps 768 hidden units to 50,257 units (the number of tokens in the vocabulary).
We replace the original linear output layer with a layer that maps from 768 hidden units to only 2 units, where the 2 units represent the two classes ("spam" and "not spam").
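A minimal sketch of this head swap, assuming a GPT-2-small-style model object that is already instantiated and exposes its output projection as model.out_head (the attribute name and the freezing strategy are assumptions for illustration):

import torch.nn as nn

# Original head: nn.Linear(768, 50257), i.e., hidden size -> vocabulary size
num_classes = 2                                   # "spam" vs. "not spam"
model.out_head = nn.Linear(768, num_classes)      # new classification head

# Optionally freeze the pretrained weights and train only the new head
for param in model.parameters():
    param.requires_grad = False
for param in model.out_head.parameters():
    param.requires_grad = True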
https://magazine.sebastianraschka.com/p/finetuning-large-language-models
{
"instruction": "Rewrite the following sentence using passive voice.",
"input": "The team achieved great results.",
"output": "Great results were achieved by the team."
},
### Instruction:
Rewrite the following sentence using passive voice.
### Input:
The team achieved great results.
### Response:
Great results were achieved by the team.
[Figure: the formatted prompt above is fed to the LLM, and the model response is "Great results were achieved by the team."]
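A minimal sketch of how such a JSON entry can be rendered into the Alpaca-style prompt shown above (the function name and details are illustrative, not from the original slides):

def format_input(entry):
    # Render an instruction-dataset entry into the ### Instruction / ### Input /
    # ### Response prompt format.
    prompt = f"### Instruction:\n{entry['instruction']}"
    if entry.get("input"):
        prompt += f"\n\n### Input:\n{entry['input']}"
    prompt += "\n\n### Response:\n"
    return prompt

entry = {
    "instruction": "Rewrite the following sentence using passive voice.",
    "input": "The team achieved great results.",
    "output": "Great results were achieved by the team.",
}
# During instruction finetuning, format_input(entry) is the prompt and
# entry["output"] is the target text the model learns to generate.
print(format_input(entry) + entry["output"])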
MMLU
Rank  Model                  Average↑ (%)  Paper
1     Gemini Ultra (~1760B)  90            Gemini: A Family of Highly Capable Multimodal Models
2     GPT-4o                 88.7          GPT-4 Technical Report
model_answer = model(input)  # pseudocode: query the LLM with a benchmark question
https://github.com/EleutherAI/lm-evaluation-harness
https://github.com/Lightning-AI/litgpt/blob/main/tutorials/evaluation.md
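A minimal sketch of how a multiple-choice benchmark such as MMLU is typically scored: present the question and answer options, then compare how likely the model finds each answer letter as the next token. The Hugging Face-style model/tokenizer interface below is an assumption for illustration; the harnesses linked above implement this properly.

import torch

def pick_answer(model, tokenizer, question, choices):
    # Build a multiple-choice prompt and compare the model's scores
    # for each answer letter (A-D) as the next token.
    prompt = question + "\n"
    for letter, choice in zip("ABCD", choices):
        prompt += f"{letter}. {choice}\n"
    prompt += "Answer:"
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits            # (1, seq_len, vocab_size)
    next_token_logits = logits[0, -1]         # scores for the next token
    letter_ids = [tokenizer(" " + c).input_ids[-1] for c in "ABCD"]
    return "ABCD"[int(next_token_logits[letter_ids].argmax())]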
AlpacaEval
The model's response is compared against a reference response from GPT-4 Preview, using a GPT-4-based auto-annotator as the judge.
https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/03_model-evaluation/llm-instruction-eval-openai.ipynb
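A minimal sketch of the GPT-4-based auto-annotator idea, assuming the official openai Python package and an OPENAI_API_KEY in the environment; the prompt wording and judge model name are illustrative, not AlpacaEval's actual setup.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge(instruction, reference_response, model_response):
    # Ask a stronger "judge" model to score the candidate response
    # against a reference response on a 0-100 scale.
    prompt = (
        f"Instruction: {instruction}\n"
        f"Reference response: {reference_response}\n"
        f"Model response: {model_response}\n"
        "Score the model response from 0 to 100 and respond with the integer only."
    )
    reply = client.chat.completions.create(
        model="gpt-4-turbo",  # judge model name is illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content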
Preference finetuning: improves helpfulness + safety if developing a chatbot
Instruction finetuning
Continued pretraining
Code Llama: Open Foundation Models for Code, https://arxiv.org/abs/2308.12950
https://sebastianraschka.com/books/
https://lightning.ai
Slides
🗺 https://sebastianraschka.com/pdf/slides/2024-build-llms.pdf