You're reading from RAG-Driven Generative AI Build custom retrieval augmented generation pipelines with LlamaIndex, Deep Lake, and Pinecone

Product type Paperback

Published in Sep 2024

Publisher Packt

ISBN-13 9781836200918

Length 338 pages

Edition 1st Edition

Languages

Python

Tools

Docker

Concepts

GPT/LLMs

Author (1):

Denis Rothman

Table of Contents (14) Chapters

Preface

1. Why Retrieval Augmented Generation?

2. RAG Embedding Vector Stores with Deep Lake and OpenAI

3. Building Index-Based RAG with LlamaIndex, Deep Lake, and OpenAI

4. Multimodal Modular RAG for Drone Technology

5. Boosting RAG Performance with Expert Human Feedback

6. Scaling RAG Bank Customer Data with Pinecone

7. Building Scalable Knowledge-Graph-Based RAG with Wikipedia API and LlamaIndex

8. Dynamic RAG with Chroma and Hugging Face Llama

9. Empowering AI Models: Fine-Tuning RAG Data and Human Feedback

10. RAG for Video Stock Production with Pinecone and OpenAI

11. Other Books You May Enjoy

12. Index

Appendix

Index

Activeloop

URL 40

Activeloop Deep Lake 32, 33

adaptive RAG 118-120

selection system 125

advanced RAG 4, 21

index-based search 24

vector search 22

Agricultural Marketing Service (AMS) 203

AI-generated video dataset 263

diffusion transformer model video dataset, analyzing 266

diffusion transformer, working 264-266

Amazon Web Services (AWS) 146

Apollo program

reference link 41

augmented generation, RAG pipeline 50, 51

augmented input 53, 54

input and query retrieval 51-53

bag-of-words (BoW) 221

Bank Customer Churn dataset

collecting 146-151

environment, installing for Kaggle 148, 149

exploratory data analysis 151-153

ML model, training 154

preparing 146-148

Chroma 214, 215

Chroma collection

completions, embedding 220, 221

completions, storing 220, 221

data, embedding 218, 219

data, upserting 218, 219

embeddings, displaying 221

model, selecting 219

content generation 132-134

cosine similarity

implementing, to measure similarity between user input and generative AI model's output 56-58

data embedding and storage, RAG pipeline 44, 45

batch of prepared documents, retrieving 45, 46

data, adding to vector store 47, 48

embedding function 47

vector store, creating 46

vector store information 48-50

vector store, verifying for existence 46

data embeddings 33

data, for upsertion

preparing 193, 194

dataset

downloading 239

preparing, for fine-tuning 239-242

visualizing 240

Davies-Bouldin index 156

Deep Lake API

reference link 48

Deep Lake vector store

creating 194

populating 194

diffusion transformer model video dataset

analyzing 266

thumbnails and videos, displaying 270-272

video download and display functions 266-268

video file 268-270

documents

collecting 188

preparing 188

dynamic RAG

applications 210

architecture 210-212

collection, deleting 230, 231

collection, querying 222-225

dataset, downloading 216, 217

dataset, preparing 216, 217

environment, installing 212

prompt 225

prompt response 227

query result, retrieving 227

session time, activating 215, 216

total session time 231

using, with Llama 227-230

dynamic RAG environment installation

Chroma 214, 215

Hugging Face 213, 214

embedding models, OpenAI

reference link 47

embeddings 32

entry-level advanced RAG

coding 9

entry-level modular RAG

coding 9

entry-level naïve RAG

coding 9

environment

installing 238, 239

environment setup, RAG pipeline 36

authentication process 39, 40

components, in installation process 36, 37

drive, mounting 37

installation packages 36

libraries 36

requisites, installing 39

subprocess, creating to download files from GitHub 37, 38

evaluator 8, 134

cosine similarity score 134

human-expert evaluation 137-140

human feedback 9

human user rating 135-137

metrics 9

response time 134

fine-tuned OpenAI model

using 246-248

fine-tuning

dataset, preparing for 239-242

versus RAG 4

fine-tuning documentation, OpenAI

reference link 248

fine-tuning static RAG data

architecture 236, 237

foundations and basic implementation

data, setting up with list of documents 13

environment, installing 10, 11

generator function, using GPT-4o 11-13

query, for user input 13-15

Galileo (spacecraft)

reference link 42

generative AI environment

installing 131, 132

generator 8, 124

augmented input with HF 8

content generation 132-134

generation and output (G4) 8

generative AI environment, installing 131, 132

HF-RAG for augmented document inputs, integrating 125, 126

input 8, 126

mean ranking simulation scenario 126

prompt engineering (G3) 8

Generator and Commentator 263, 273, 274

AI-generated video dataset 263

frames, commenting on 275-277

Pipeline 1 controller 277

video, displaying 274

videos, splitting into frames 274, 275

GitHub 261

Hubble Space Telescope

reference link 41

Hugging Face 213, 214

reference link 213

hybrid adaptive RAG

building, in Python 120

generator 124

retriever 121

index-based RAG 62

architecture 62-64

index-based search 21, 24, 25, 62

augmented input 25

feature extraction 25

generation 26

versus vector-based search 64

International Space Station (ISS)

reference link 41

Juno (spacecraft)

reference link 41

Kaggle

reference link 148

Kepler space telescope

reference link 42

keyword index query engine 74, 85, 86

performance metric 87

knowledge-graph-based semantic search

graph, building from trees 185-187

RAG architecture, using for 182-185

knowledge graph index-based RAG 195-197

example metrics 202, 203

functions, defining 200

generating 196, 197

graph, displaying 198, 199

interacting 199, 200

metrics calculation 204-206

metrics display 204-206

re-ranking 201

similarity score packages, installing 200

knowledge graphs 181

Large Language Model (LLM) 3

list index query engine 74, 83, 84

performance metric 84, 85

Llama

using, with dynamic RAG 227-230

LLM dataset

loading 93-95

LLM query engine

initializing 95

textual dataset, querying 96

user input, for multimodal modular RAG 95

machine learning (ML) 146, 215

Mars rover

reference link 41

mean ranking simulation scenario

human-expert feedback RAG 128-130

no human-expert feedback RAG 130, 131

no RAG 127

metadata

retrieving 188-192

metrics

analyzing, of training process and model 249-251

metrics, fine-tuned models

reference link 249

ML model, training 154

clustering evaluation 156-158

clustering implementation 156-158

data preparation and clustering 154-156

modular RAG 4, 26-28

strategies 28

multimodal dataset structure

bounding boxes, adding 100-103

image, displaying 100

image, saving 100-104

image, selecting 99

navigating 99

multimodal modular RAG 90-92

building, for drone technology 93

performance metric 110

user input 95

multimodal modular RAG, performance metric 110

LLM 110

multimodal 111-113

overall performance 113

multimodal modular RAG program, for drone technology

building 93, 108-110

LLM dataset, loading 93-95

multimodal dataset, loading 96-99

multimodal dataset structure, navigating 99

multimodal dataset, visualizing 96-99

multimodal query engine, building 104

performance metric 110

multimodal query engine

building 104

creating 104, 105

query, running on VisDrone multimodal dataset 106

response, processing 106, 107

source code image, selecting 107, 108

vector index, creating 104, 105

naïve RAG 4

augmented input 20

example, creating with 18

generation 20, 21

keyword search and matching 18, 19

metrics 19, 20

ONNX

reference link 215

OpenAI 261, 262

URL 39

OpenAI model

fine-tunes, monitoring 244-246

fine-tuning 242, 243

for embedding 159

for generation 159

Pinecone constraints 159

Open Neural Network Exchange (ONNX) 215

Pinecone 262, 263

reference link 172

used, for scaling 144

Pinecone index

querying 282-286

Pinecone index (vector store)

challenges 159, 160

creating 166, 167

data, duplicating 165, 166

dataset, chunking 162

dataset, embedding 163-165

dataset, processing 161, 162

environment, installing 160

querying 170-172

scaling 158

upserting 168-170

Pipeline 1 controller 277-279

comments, saving 279, 280

files, deleting 280

Python

used, for building hybrid adaptive RAG 120

RAG architecture

for video production 256-258

RAG ecosystem 237, 238

domains 5-7

evaluator component 8

generator component 8

retriever component 7

trainer component 9

RAG framework

advanced RAG 4

generator 4

modular RAG 4

naïve RAG 4

retriever 4

RAG generative AI 172

augmented generation 176-178

augmented prompt 176

relevant texts, extracting 175

using, with GPT-4o 172

RAG pipeline 33, 34

augmented generation 35, 50, 51

building, steps 36

components 34

data collection 35, 40-42

data embedding and storage 35, 44, 45

data preparation 35, 40-44

environment setup 36

reasons, for component approach 34

RAG, with GPT-4o 172, 173

dataset, querying 173

target vector, querying 173, 175

Retrieval Augmented Generation (RAG) 1-3, 50

non-parametric 4

parametric 4

versus fine-tuning 4, 5

retrieval metrics 15

cosine similarity 15, 16

enhanced similarity 17

retriever 121

data, processing 122, 123

dataset, preparing 121

environment, installing 121

user input process 123, 124

retriever component

collect 7

process 7

retrieval query 8

storage 7

scaling, with Pinecone 144

architecture 144-146

semantic index-based RAG program

building 64, 65

cosine similarity metric 75, 76

Deep Lake vector store, creating 69-74

Deep Lake vector store, populating 69-74

documents collection 65-69

documents preparation 65-69

environment, installing 65

implementing 74

query parameters 75

user input 75

session time

activating 215, 216

Silhouette score 156

space exploration

reference link 41

SpaceX

reference link 41

Term Frequency-Inverse Document Frequency (TF-IDF) 15, 57, 134

trainer 9

training loss 251

tree index query engine 74, 80-82

performance metric 83

upserting process

reference link 168

user interface (UI) 124

vector-based search

versus index-based search 64

vector search 21

augmented input 23

generation 23

metrics 22, 23

vector similarity

reference link 167

Vector Store Administrator 281, 282

Pinecone index, querying 282-286

vector store index query engine 74-76

optimized chunking 79

performance metric 79, 80

query response and source 77, 78

vector stores 33

Video Expert 286-291

video production ecosystem, environment 259

GitHub 261

modules and libraries, importing 260, 261

OpenAI 261

Pinecone 262, 263

Voyager program

reference link 42

Wikipedia data

retrieving 188-192
