
hackerllama

The Llama Hitchhiking Guide to Local LLMs


PUBLISHED
January 12, 2024

Here are some terms that are useful to know when joining the Local LLM community.

1. LocalLlama: A Reddit community of practitioners, researchers, and hackers doing all kinds of crazy
things with ML models.

2. LLM: A Large Language Model. Usually a transformer-based model with a lot of parameters…billions
or even trillions.

3. Transformer: A type of neural network architecture that is very good at language tasks. It is the basis
for most LLMs.

4. GPT (Generative Pre-trained Transformer): A type of transformer that is trained to predict the next token in a sequence. GPT-3 is an example of a GPT model…who could tell??

4.1 Auto-regressive: A type of model that generates text one token at a time. It is auto-regressive
because it uses its own predictions to generate the next token. For example, the model might receive
as input “Today’s weather” and generate the next token, “is”. It will then use “Today’s weather is” as
input and generate the next token, “sunny”. It will then use “Today’s weather is sunny” as input and
generate the next token, “and”. And so on.
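
To make the loop concrete, here is a minimal sketch of greedy auto-regressive decoding with the transformers library (gpt2 is just an illustrative model choice):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Today's weather", return_tensors="pt").input_ids
for _ in range(3):  # generate three tokens, one at a time
    logits = model(input_ids).logits
    next_token = logits[0, -1].argmax()  # greedy: pick the most likely next token
    input_ids = torch.cat([input_ids, next_token.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))  # e.g. "Today's weather is ..."
```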

5. Token: Models don’t understand words. They understand numbers. When we receive a sequence of
words, we convert them to numbers. Sometimes we split words into pieces, such as “tokenization” into
“token” and “ization”. This is needed because the model has a limited vocabulary. A token is the
smallest unit of language that a model can understand.
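
As a quick illustration, this is what a tokenizer does (the model choice is illustrative; every model has its own vocabulary):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(tokenizer.tokenize("tokenization"))   # e.g. ['token', 'ization']
print(tokenizer("tokenization").input_ids)  # the numbers the model actually sees
```
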
6. Context length: The number of tokens that the model can use at a time. The higher the context length,
the more memory the model needs to train and the slower it is to run. E.g. Llama 2 can manage up to
4096 tokens.

6.1 LLaMA: A pre-trained model trained by Meta, shared with some research groups under private access, and then leaked. It led to an explosion of cool projects. 🦙
6.2 Llama 2: An open-access pre-trained model released by Meta. It led to another explosion of very
cool projects, and this one was not leaked! The license is not technically open-source but it’s still quite
open and permissive, even for commercial use cases. 🦙🦙
6.3 RoPE: Rotary Position Embeddings, a technique for encoding token positions. Scaling tricks built on RoPE allow you to significantly expand the context length of a model.

6.4 SuperHOT: A technique that expands the context length of RoPE-based models even further by doing some minimal additional training.

7. Pre-training: Training a model on a very large dataset (trillions of tokens) to learn the structure of language. Imagine you have millions of dollars, like a good GPU-rich. You usually scrape big datasets from the internet and train your model on them. This is called pre-training. The idea is to end up with a model that has a strong understanding of language. This does not require labeled data! This is done before fine-tuning. Examples of pre-trained models are GPT-3, Llama 2, and Mistral.

7.1 Mistral 7B: A pre-trained model trained by Mistral. Released via torrent.

7.2 Phi 2: A pre-trained model by Microsoft. It only has 2.7B parameters but it’s quite good for its size! It was trained on relatively little, textbook-quality data, which shows the power of high-quality data.

7.3 transformers: a Python library to access models shared by the community. It allows you to download pre-trained models and fine-tune them for your own needs.
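
For example, two lines are enough to download a model from the Hub and generate text (the model name is just an example):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # downloads the model from the Hub
print(generator("Llamas are", max_new_tokens=20)[0]["generated_text"])
```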

7.4 Base vs conversational: a pre-trained model is not specifically trained to “behave” in a conversational manner. If you try to use a base model (e.g. GPT-3, Mistral, Llama) directly for conversations, it won’t work as well as the fine-tuned conversational variant (ChatGPT, Mistral Instruct, Llama Chat). When looking at benchmarks, you want to compare base models with base models and conversational models with conversational models.

8. Fine-tuning: Training a model on a small (labeled) dataset to learn a specific task. This is done after pre-training. Imagine you have a few dollars, like a good fellow GPU-poor. Rather than training a model from scratch, you pick a pre-trained (base) model and fine-tune it. You usually pick a small dataset of a few hundred to a few thousand samples and train the model on it. This is called fine-tuning. The idea is to end up with a model that has a strong understanding of a specific task. For example, you can fine-tune a model with your tweets to make it generate tweets like you! (but please don’t). You can fine-tune many models on your gaming laptop! Examples of fine-tuned models are ChatGPT, Vicuna, and Mistral Instruct.

8.1 Mistral 7B Instruct: A fine-tuned version of Mistral 7B.

8.2 Vicuna: A cute animal that is also a fine-tuned model. It starts from LLaMA-13B and is fine-tuned on user conversations with ChatGPT.

8.3 Number of parameters: Notice the -13B in point 8.2. That’s the number of parameters in the model. Each parameter is a number (stored with a certain precision) that is part of the model. The parameters are learned during pre-training and fine-tuning to minimize the error.

9. Prompt: A few words that you give to the model to start generating text. For example, if you want to
generate a poem, you can give the model the first line of the poem as a prompt. The model will then
generate the rest of the poem!

10. Zero-shot: A prompt that asks the model to perform a task without giving it any examples of that task. For example, you can give the model the first line of a poem and ask it to generate the rest. The model will do its best, even though the prompt contains no example poems! When you use ChatGPT, you often do zero-shot generation!

User: Write a poem about a llama


_______________
Model:
Graceful llama, in Andean air,
Elegant stride, woolly flair.
Mountains echo, mystic charm,
Llama's gaze, a tranquil balm.
11. Few-shot: A prompt that includes a couple of examples of the task before the actual query. No fine-tuning is involved; the examples live in the prompt. This can improve the quality a lot!

User:
Text: "The cat sat on the mat."
Label: Sentence about an animal.

Text: "The sun is incredibly bright today."
Label: Sentence about weather.

Classification task: classify the following text.
Text: "Rainy days make me want to stay in bed."

__________________
Model:
Label: Sentence about weather.

12. Instruct-tuning: A type of fine-tuning that uses instruction datasets, resulting in more controlled behavior when generating responses or performing tasks.

12.1 Alpaca: A dataset of 52,000 instructions generated with OpenAI APIs. It kicked off a big wave of people using OpenAI to generate synthetic data for instruct-tuning. It cost about $500 to generate.

12.2 LIMA: A model fine-tuned on very few, carefully curated examples that demonstrates strong performance. It shows that adding more data does not always correlate with better quality.

13. RLHF (Reinforcement Learning from Human Feedback): A type of fine-tuning that uses reinforcement learning (RL) and human-generated feedback. Thanks to the introduction of human feedback, the end model is very good at things such as conversations! It kicks off with a base model that generates a bunch of responses. Humans then rate the answers (preferences). The preferences are used to train a Reward Model that generates a score for a given text. Using Reinforcement Learning, the initial LM is trained to maximize the score generated by the Reward Model. Read more about it here.
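
Here is a heavily simplified sketch of the RL step using trl's classic PPO API (argument names have changed across trl versions, and reward_model_score is a hypothetical stand-in for your trained Reward Model):

```python
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")  # toy model choice
tokenizer = AutoTokenizer.from_pretrained("gpt2")
ppo_trainer = PPOTrainer(config=PPOConfig(), model=model, tokenizer=tokenizer)

query = tokenizer.encode("Write a haiku about llamas", return_tensors="pt")[0]
response = ppo_trainer.generate(query, max_new_tokens=32)[0]
# reward_model_score is hypothetical: your Reward Model turns text into a score
reward = torch.tensor(reward_model_score(query, response))
ppo_trainer.step([query], [response], [reward])  # RL update toward higher reward
```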

13.1 RL: Reinforcement learning is a type of machine learning that uses rewards to train a model. For
example, you can train a model to play a game by giving it a reward when it wins and a punishment
when it loses. The model will learn to win the game!

13.2. Reward Model: A model that assigns a score to a piece of text. In RLHF, it is trained on human preference data and learns to give higher scores to the responses humans prefer.

13.3 ChatGPT: An RLHF fine-tuned GPT-3.5 model that is very good at conversations.

13.4 AIF: An alternative to human feedback…AI Feedback!


14. PPO (Proximal Policy Optimization): A reinforcement learning algorithm used to train models. It is the classic choice for RLHF.

15. DPO (Direct Preference Optimization): A type of training which removes the need for a reward model. It significantly simplifies the RLHF pipeline.

15.1 Zephyr: A 7B Mistral-based model trained with DPO. It has similar capabilities to the Llama 2
Chat model of 70B parameters. It came out with a nice handbook of recipes.

15.2 Notus: A variation of Zephyr trained with better-filtered and fixed data. It does better!

15.3 Overfitting: occurs in ML when a model learns the training data too well, capturing noise and
specific patterns that do not generalize to new, unseen data, leading to poor performance on real-
world tasks.

15.4 DPO Overfits: Although DPO shows overfitting behaviors after one epoch, this does not harm downstream performance on chat evaluations. Did our ML teachers lie to us when they said overfitting was bad?

15.5 IPO: A change in the DPO objective which is simpler and less prone to overfitting.

15.6. KTO: While PPO, DPO, and IPO require pairs of accepted vs rejected generations, KTO just needs a binary label (accepted or rejected) per sample, allowing it to scale to much more data.

15.7 trl: A library that lets you train models with DPO, IPO, KTO, and more!
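
As a hedged sketch of what DPO training with trl looks like (exact argument names vary across trl versions, and the model/dataset choices here are just examples):

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # toy base model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
# DPO expects (prompt, chosen, rejected) preference triples
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo-model", beta=0.1),  # beta controls the KL strength
    train_dataset=dataset,
    processing_class=tokenizer,  # called `tokenizer=` in older trl versions
)
trainer.train()
```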

16. Open LLM Leaderboard: A leaderboard where you can find benchmark results for many open-access
LLMs.

16.1 Benchmark: A benchmark is a test that you run to compare different models. For example, you can run a benchmark to compare the performance of different models on a specific task.

16.2 TruthfulQA: A not-great benchmark to measure a model’s ability to generate truthful answers.

16.3 Conversational models: The Open LLM Leaderboard should mostly be used to compare base models, not conversational models. It still provides some useful signal about conversational models, but it should not be the final way to evaluate them.
17. Chatbot Arena: A popular crowd-sourced open benchmark of human preferences. It’s good for comparing conversational models.

18. MT-Bench: A multi-turn benchmark of 160 questions across eight domains. Each response is
evaluated by GPT-4. (This presents limitations…what happens if the model is better than GPT-4?)

19. Mixture-of-Experts (MoE): A model architecture in which some of the (dense) layers are replaced with a set of experts. Each expert is a small neural network. There is a small network, the router, that decides which expert to use for each token (read more here). Some clarifications (a toy sketch of the routing logic follows this list):

- A MoE is not an ensemble.
- If we say a MoE has 8 experts, it means each replaced dense layer is replaced with 8 experts. If there were 3 replaced layers, then there are 24 experts in total!
- We can activate multiple experts at the same time. For a given sentence, “hello world”, “hello” might be sent to experts 1 and 2 while “world” goes to experts 2 and 4.
- The experts in a MoE do not specialize in a task. They are all trained on the same task; they just get different tokens! Sometimes they do specialize in certain types of tokens, as shown in a table from the ST-MoE paper.
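
The sketch below is a minimal, unoptimized top-2 router in PyTorch (real implementations batch tokens per expert; everything here is illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy sparse MoE layer: a router sends each token to its top-2 experts."""
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # the small routing network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, dim)
        weights = F.softmax(self.router(x), dim=-1)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalize the top-k
        out = torch.zeros_like(x)
        for t in range(x.shape[0]):  # each token only visits its chosen experts
            for w, idx in zip(top_w[t], top_idx[t]):
                out[t] += w * self.experts[int(idx)](x[t])
        return out
```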
19.1 GPT-4: A kinda good model, but we don’t know what it is. The rumors say it’s a MoE.

19.2 Mixtral: A MoE model released by Mistral. It has 47B parameters but only 12B parameters are
used at a time, making it very efficient.
20. Model Merging: A technique that allows us to combine multiple models of the same architecture into
a single model. Read more here.

20.1 Mergekit: A cool open-source tool to quickly merge models.

20.2 Averaging: The most basic merging technique. Pick two models and average their weights. Somehow it kinda works!
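
In code, naive averaging is just this (assuming two fine-tunes of the same base architecture; the model names are placeholders):

```python
from transformers import AutoModelForCausalLM

# Both models must share the same architecture and parameter names.
a = AutoModelForCausalLM.from_pretrained("org/finetune-a").state_dict()
b = AutoModelForCausalLM.from_pretrained("org/finetune-b").state_dict()

merged = {name: (a[name] + b[name]) / 2 for name in a}  # element-wise average

model = AutoModelForCausalLM.from_pretrained("org/finetune-a")
model.load_state_dict(merged)  # the merged model is ready to use
```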

20.3 Frankenmerge: It lets you concatenate layers from different LLMs, allowing you to do crazy things.

20.4 Goliath-120B: A frankenmerge that combines two Llama 2 70B models to achieve a 120B model.

20.5 MoE Merging: (Not 100% sure about this one) An experimental branch in mergekit that allows building a MoE-like model by combining different models. You specify which models and which types of prompts you want each expert to handle, hence ending up with expert task-specialization.

20.6 Phixtral: A MoE merge of Phi 2 DPO and Dolphin 2 Phi 2.


21. Local LLMs: If models are small enough, we can run them on our own computers or even our phones!

21.1 TinyLlama: A project to pre-train a 1.1B Llama model on 3 trillion tokens.

21.2 Cognitive Computations: A community (led by Eric Hartford) that is fine-tuning a bunch of models.

21.3 Uncensored models: Many models have strong alignment that makes them refuse even harmless requests, such as asking Llama how to kill a Linux process. Training uncensored models aims to remove specific biases ingrained in the decision-making process of fine-tuning a model. Read more here.

21.4 llama.cpp: A tool to run Llama-like models in C++.

21.5 GGUF: A format introduced by llama.cpp to store models. It replaces the old file format, GGML.

21.6 ggml: A tensor library for ML that powers projects such as llama.cpp and whisper.cpp (not the same as GGML, the file format).

21.7 Georgi Gerganov: The creator of llama.cpp and ggml!

21.8 Whisper: The state-of-the-art open-source speech-to-text model.


21.9 OpenAI: A company that does closed source AI. (kidding, they open-sourced Whisper!)

21.10 MLX: A new framework for Apple devices that allows easy inference and fine-tuning of models.
22. Local LLM tools: If you don’t know how to code, there are a couple of tools that can be useful.

22.1 Oobabooga: A simple web app that allows you to use models without coding. It’s very easy to
use!

22.2 LM Studio: A nice advanced app that runs models on your laptop, entirely offline.

22.3 ollama: An open-source tool to run LLMs locally. There are multiple web/desktop apps and
terminal integrations on top of it.

22.4 ChatUI: An open-source UI to use open-source models.

23. Quantization: A technique that allows us to reduce the size of a model. It is done by reducing the
precision of the model’s weights. For example, we can reduce the precision from 32 bits to 8 bits. This
reduces the size of the model by 4 times! The model will (sometimes) be less accurate but it will be
much smaller. This allows us to run the model on smaller devices such as phones.
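
For intuition, here is a toy round-to-nearest 8-bit quantizer in Python (real methods like GPTQ and AWQ are much cleverer about minimizing the error):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric 8-bit quantization: float32 weights -> int8 values + one scale."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)  # 4x smaller than float32
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale  # approximate original weights

w = np.random.randn(4096).astype(np.float32)
q, scale = quantize_int8(w)
print("max error:", np.abs(w - dequantize(q, scale)).max())
```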

23.1 TheBloke: A bloke that quantizes models. As soon as a model is out, he quantizes it! See their HF
Profile.

23.2 Hugging Face: A platform to find and share open-access models, datasets, and demos. It’s also a company that has built different open-source libraries (and where I work!)

23.3. Facehugger: A monster from the Alien movie. It should also be an open source tool. It’s not yet.

23.4. GPTQ: A popular quantization technique.

23.5 AWQ: Another popular quantization technique.

23.6 EXL2: A different quantization format used by a library called exllamav2 (among many others)

23.7 LASER: A technique that reduces the size of the model and increases its performance by reducing the rank of specific weight matrices. It requires no additional training.

24. PEFT (Parameter-Efficient Fine-Tuning): A family of methods that allow fine-tuning models without modifying all the parameters. Usually, you freeze the base model, add a small set of new parameters, and only train those. This greatly reduces the amount of compute required, and you can still achieve very good results!

24.1 peft: A popular open-source library to do PEFT! It’s used in other projects such as trl.

24.2 adapters: Another popular library to do PEFT.

24.3. unsloth: A higher-level library to do PEFT (using QLoRA).


24.4. LoRA: One of the most popular PEFT techniques. It adds low-rank “update matrices”. The base
model is frozen and only the update matrices are trained. This can be used for image classification,
teaching Stable Diffusion the concept of your pet, or LLM fine-tuning.
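
Attaching LoRA adapters with the peft library looks roughly like this (the base model and target modules are illustrative; c_attn is GPT-2's attention projection):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # example base model
config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"])  # low-rank updates
model = get_peft_model(model, config)  # freezes the base, adds update matrices
model.print_trainable_parameters()  # only a tiny fraction is trainable
```
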
25. QLoRA: A technique that combines LoRA with quantization: the base model is loaded in 4-bit precision and only the LoRA parameters are updated! This allows fine-tuning models on very GPU-poor hardware.
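
A hedged sketch of a QLoRA setup (the model name and hyperparameters are illustrative):

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the frozen base model in 4-bit NF4 precision...
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", quantization_config=bnb  # example model
)
# ...then attach trainable LoRA matrices on top of the quantized weights.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
```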

25.1. Tim Dettmers: A researcher who has done a lot of work on PEFT and created QLoRA.

25.2. Guanaco (model): A LLaMA fine-tune trained with QLoRA.

26. axolotl: A cute animal that is also a high-level tool to streamline fine-tuning, including support for
things such as QLoRA.
27. Nous Research: An open-source Discord community turned company that releases a bunch of cool models.

28. Multimodal: A single model that can handle multiple modalities. For example, a model that can
generate text and images at the same time. Or a model that can generate text and audio at the same
time. Or a model that can generate text, images, and audio at the same time. Or a model that can
generate text, images, audio, video, smells, tastes, feelings, thoughts, dreams, memories,
consciousness, souls, universes, gods, multiverses, and omniverses at the same time. (thanks ChatGPT
for your hallucination)

28.1 Hallucination: When a model generates responses that may be coherent but are not actually accurate, leading to the creation of misinformation or imaginary scenarios…such as the one above!

28.2 LLaVA: A multimodal model that can receive images and text as input and generate text responses.

29. Bagel: A process which mixes a bunch of supervised fine-tuning and preference data. It uses different prompt formats, making the model more robust to all kinds of prompts.

30. Code Models: LLMs that are specifically pre-trained for code.

30.1. Big Code Models Leaderboard: A leaderboard to compare code models on the HumanEval dataset.

30.2. HumanEval: A very small dataset of 164 Python programming problems. It is translated into 18 programming languages in MultiPL-E.

30.3 BigCode: An open scientific collaboration working on code-related models and datasets.

30.4 The Stack: A dataset of 6.4TB of permissively licensed code data covering 358 programming languages.

30.5 Code Llama: The best base code model. It’s based on Llama 2.
30.6 WizardLM: A research team from Microsoft…but also a Discord community.

30.7 WizardCoder: A code model released by WizardLM. Its architecture is based on Llama.
31. Flash Attention: An exact attention algorithm which provides a huge speedup through optimized memory access (it computes the same result as standard attention, just much faster).

31.1 Flash Attention 2: An upgrade to the flash attention algorithm that provides even more speedup.

31.2. Tri Dao: The author of both techniques and a legend in the ecosystem.

I hope you enjoyed this read! Feel free to suggest new terms or corrections in the comments below. I’ll
keep updating this post as new terms come up.

7 Comments - powered by utteranc.es


DrChrisLevy commented on Jan 21, 2024
This is amazing, thanks !

2404589803 commented on Jan 22, 2024


great job!

fpaupier commented on Jan 22, 2024


Great overview of the different concepts, discovered many! thanks @osanseviero
havenqi commented on Jan 23, 2024
great job! marking this post

FelikZ commented on Feb 6, 2024


Good stuff. Would be nice to have a dive into Embeddings and tooling around it.

sanzgadea commented on Mar 11, 2024


Good post! One comment is that Flash Attention is not an approximation of attention but it is
exact, meaning it computes the exact attention calculation. It achieves the speedup through
optimized memory access and parallel processing techniques.

sugatoray commented on May 18, 2024


This is an incredibly useful article. Thank you @osanseviero for maintaining this.
