
multimodel_rag

December 11, 2024

1 Setup
[ ]: %pip install -U "unstructured[all-docs]" pillow lxml pillow
%pip install -U chromadb tiktoken
%pip install -U langchain langchain-community langchain-openai langchain-groq
%pip install -U python_dotenv
%pip install -U langchain-ollama
%pip install -U transformers
%pip install -qU "langchain-chroma>=0.1.2"
%pip install -qU langchain-openai
%pip install PyMuPDF

[1]: from dotenv import load_dotenv


load_dotenv()

[1]: True
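load_dotenv() reads credentials from a local .env file next to the notebook. The file itself is not shown; a minimal sketch of what it is assumed to contain for the OpenAI-backed cells below (a GROQ_API_KEY would only matter if the installed langchain-groq package were actually used):

# .env (assumed contents, not part of the original notebook)
OPENAI_API_KEY=sk-...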

2 Data Extraction from PDF


2.1 Partition PDF tables, text, and images
[2]: from unstructured.partition.pdf import partition_pdf

output_path = "./pdf/"
file_path = output_path + 'attention_is_all_you_need.pdf'

# Reference: https://docs.unstructured.io/open-source/core-functionality/chunking

chunks = partition_pdf(
    filename=file_path,
    infer_table_structure=True,           # extract tables
    strategy="hi_res",                    # mandatory to infer tables

    extract_image_block_types=["Image"],  # add 'Table' to the list to extract images of tables
    # image_output_dir_path=output_path,  # if None, images and tables will be saved as base64

    extract_image_block_to_payload=True,  # if True, will extract base64 for API usage

    chunking_strategy="by_title",         # or 'basic'
    max_characters=10000,                 # defaults to 500
    combine_text_under_n_chars=2000,      # defaults to 0
    new_after_n_chars=6000,

    # extract_images_in_pdf=True,         # deprecated
)

/Users/a2024/miniforge3/envs/multimodelrag/lib/python3.12/site-
packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update
jupyter and ipywidgets. See
https://ipywidgets.readthedocs.io/en/stable/user_install.html
from .autonotebook import tqdm as notebook_tqdm

[3]: len(chunks)

[3]: 17

[4]: [str(type(x)) for x in chunks]

[4]: ["<class 'unstructured.documents.elements.CompositeElement'>",


"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.Table'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.Table'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.Table'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.Table'>",
"<class 'unstructured.documents.elements.CompositeElement'>",
"<class 'unstructured.documents.elements.CompositeElement'>"]

[5]: # The valuable chunk types are Table and CompositeElement
set([str(type(el)) for el in chunks])

[5]: {"<class 'unstructured.documents.elements.CompositeElement'>",


"<class 'unstructured.documents.elements.Table'>"}

[ ]: chunks[0].metadata.orig_elements

[7]: chunks[0].metadata.orig_elements[0].to_dict()

[7]: {'type': 'UncategorizedText',


'element_id': '8089b0b0-9217-46cf-8d41-cb455ec7164f',
'text': '3',
'metadata': {'coordinates': {'points': ((45.388888888888886,
594.2222222222224),
(45.388888888888886, 622.0000000000002),
(100.94444444444446, 622.0000000000002),
(100.94444444444446, 594.2222222222224)),
'system': 'PixelSpace',
'layout_width': 1700,
'layout_height': 2200},
'last_modified': '2024-12-10T22:44:55',
'filetype': 'PPM',
'languages': ['eng'],
'page_number': 1}}

[ ]: chunks[1].to_dict()

[ ]: elements = chunks[3].metadata.orig_elements

chunk_images = [el for el in elements if 'Image' in str(type(el))]
chunk_images[0].to_dict()

2.2 Separate the Text, Tables, and Images


[11]: # Separate tables from texts
tables = []
texts = []

for chunk in chunks:
    if "Table" in str(type(chunk)):
        tables.append(chunk)

    if "CompositeElement" in str(type(chunk)):
        texts.append(chunk)
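An equivalent split that checks the element classes directly, as a sketch (the string matching above behaves the same way):

from unstructured.documents.elements import CompositeElement, Table

# Same separation, but with explicit isinstance checks instead of string matching
tables = [chunk for chunk in chunks if isinstance(chunk, Table)]
texts = [chunk for chunk in chunks if isinstance(chunk, CompositeElement)]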

[ ]: tables[0].to_dict()

[13]: tables[0].metadata.text_as_html

[13]: '<table><tr><td>Layer Type</td><td>Complexity per Layer</td><td>Sequential Operations</td><td>Maximum Path Length</td></tr><tr><td>Self-Attention</td><td>O(n? - d)</td><td>O(1)</td><td>O(1)</td></tr><tr><td>Recurrent</td><td>O(n- d?)</td><td>O(n)</td><td>O(n)</td></tr><tr><td>Convolutional</td><td>O(k-n- d?)</td><td>O(1)</td><td>O(logy(n))</td></tr><tr><td>Self-Attention (restricted)</td><td>O(r-n-d)</td><td>ol)</td><td>O(n/r)</td></tr></table>'
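The OCR inside the cells is imperfect (e.g. the complexity terms), but the table structure survives. If a tabular view helps, a sketch that parses the HTML into a DataFrame (pandas is not installed in the Setup cell, so treat this as an optional extra):

import pandas as pd
from io import StringIO

# Parse the extracted HTML table into a DataFrame for easier inspection
df = pd.read_html(StringIO(tables[0].metadata.text_as_html))[0]
print(df)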

[14]: # Get the images from the CompositeElement objects


def get_images_base64(chunks):
images_b64 = []
for chunk in chunks:
if "CompositeElement" in str(type(chunk)):
chunk_els = chunk.metadata.orig_elements
for el in chunk_els:
if "Image" in str(type(el)):
images_b64.append(el.metadata.image_base64)
return images_b64

images = get_images_base64(chunks)
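A quick sanity check on how many images were collected, as a sketch (not in the original notebook):

# Each entry is a base64-encoded image pulled from a CompositeElement
len(images)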

2.3 Display Image


[15]: import base64
from IPython.display import Image, display

def display_base64_image(base64_code):
# Decode the base64 string to binary
image_data = base64.b64decode(base64_code)
# Display the image
display(Image(data=image_data))

display_base64_image(images[0])

3 Summary of the Data
3.1 Text and Table Summaries
[16]: from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama.llms import OllamaLLM

# Prompt
prompt_text = """
You are an assistant tasked with summarizing tables and text.
Give a concise summary of the table or text.

Respond only with the summary, no additional comment.


Do not start your message by saying "Here is a summary" or anything like that.
Just give the summary as it is.

Table or text chunk: {element}

"""
prompt = ChatPromptTemplate.from_template(prompt_text)

# Summary chain
model = OllamaLLM(temperature=0.5, model="llama3.1:8b")
summarize_chain = {"element": lambda x: x} | prompt | model | StrOutputParser()
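Before batching over every chunk, a single-element call is a cheap way to confirm the prompt and local model are wired up correctly (a sketch, not in the original notebook):

# Summarize one text chunk to verify the chain end to end
print(summarize_chain.invoke(texts[0]))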

[17]: # Summarize text


text_summaries = summarize_chain.batch(texts, {"max_concurrency": 3})

# Summarize tables
tables_html = [table.metadata.text_as_html for table in tables]
table_summaries = summarize_chain.batch(tables_html, {"max_concurrency": 3})

[ ]: text_summaries

[ ]: table_summaries

3.2 Image Summaries


[20]: from langchain_openai import ChatOpenAI

prompt_template = """Describe the image in detail. For context,
the image is part of a research paper explaining the transformers
architecture. Be specific about graphs, such as bar plots."""
messages = [
(
"user",
[
{"type": "text", "text": prompt_template},
{
"type": "image_url",
"image_url": {"url": "data:image/jpeg;base64,{image}"},
},
],
)
]

prompt = ChatPromptTemplate.from_messages(messages)

chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

image_summaries = chain.batch(images)
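chain.batch(images) can take raw base64 strings because the prompt has a single input variable ({image}), so LangChain wraps each string into a dict automatically. A single-image call, as a sketch:

# Describe only the first extracted image
print(chain.invoke(images[0]))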

[ ]: image_summaries

3.3 Vector Storage

3.4 Creation of Vector Store
[22]: import uuid
from langchain_chroma import Chroma
from langchain.storage import InMemoryStore
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain.retrievers.multi_vector import MultiVectorRetriever

# The vectorstore to use to index the child chunks (summaries)
vectorstore = Chroma(collection_name="multi_modal_rag", embedding_function=OpenAIEmbeddings())

# The storage layer for the parent documents (raw text, tables, and images)
store = InMemoryStore()
id_key = "doc_id"

# The retriever (empty to start)
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    docstore=store,
    id_key=id_key,
)

3.5 Load the summaries and link them to the original data
[23]: # Add texts
doc_ids = [str(uuid.uuid4()) for _ in texts]
summary_texts = [
    Document(page_content=summary, metadata={id_key: doc_ids[i]})
    for i, summary in enumerate(text_summaries)
]
retriever.vectorstore.add_documents(summary_texts)
retriever.docstore.mset(list(zip(doc_ids, texts)))

# Add tables
table_ids = [str(uuid.uuid4()) for _ in tables]
summary_tables = [
    Document(page_content=summary, metadata={id_key: table_ids[i]})
    for i, summary in enumerate(table_summaries)
]
retriever.vectorstore.add_documents(summary_tables)
retriever.docstore.mset(list(zip(table_ids, tables)))

# Add image summaries
img_ids = [str(uuid.uuid4()) for _ in images]
summary_img = [
    Document(page_content=summary, metadata={id_key: img_ids[i]})
    for i, summary in enumerate(image_summaries)
]
retriever.vectorstore.add_documents(summary_img)
retriever.docstore.mset(list(zip(img_ids, images)))
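Each summary document carries a doc_id pointing back at the raw chunk it was generated from, so the vector store matches on the compact summary while the retriever returns the full original element. A way to see both sides, as a sketch (not in the original notebook):

# What the vector store matches: the summary and its linking id
hits = retriever.vectorstore.similarity_search("what is multihead attention", k=1)
print(hits[0].page_content)
print(hits[0].metadata[id_key])

# What the retriever returns for the same query: the original chunk behind that id
print(type(retriever.invoke("what is multihead attention")[0]))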

3.6 Retrieval Analysis


[34]: chunks = retriever.invoke(
"what is multihead attention"
)

[ ]: chunks

[ ]: display_base64_image(chunks[1])

[ ]: for chunk in chunks:
    print(chunk)

[39]: for i, chunk in enumerate(chunks):
    if "CompositeElement" in str(type(chunk)):
        print("\n\nChunk", i)
        for doc in chunk.metadata.orig_elements:
            print(doc.to_dict()["type"], doc.metadata.page_number)

Chunk 0
Title 4
NarrativeText 4
NarrativeText 4
UncategorizedText 4
NarrativeText 5
NarrativeText 5
Formula 5
NarrativeText 5
NarrativeText 5
Title 5
NarrativeText 5
ListItem 5
ListItem 5
ListItem 5

Chunk 2
ListItem 12
ListItem 12
ListItem 12
ListItem 12
ListItem 12
Footer 12
Title 13
Image 13
FigureCaption 13
Header 13
Image 14
NarrativeText 14
UncategorizedText 14
Image 15
Image 15
FigureCaption 15
Header 15

Chunk 3
Title 3
NarrativeText 3
Footer 3
Image 4
Image 4
NarrativeText 4
NarrativeText 4
Title 4
NarrativeText 4
NarrativeText 4
Formula 4
NarrativeText 4
NarrativeText 4

[40]: import fitz  # PyMuPDF
import matplotlib.patches as patches
import matplotlib.pyplot as plt
from PIL import Image as PILImage  # aliased so it does not shadow IPython.display.Image

def plot_pdf_with_boxes(pdf_page, segments):
    # Render the PDF page to a pixmap and wrap it in a PIL image
    pix = pdf_page.get_pixmap()
    pil_image = PILImage.frombytes('RGB', (pix.width, pix.height), pix.samples)

    fig, ax = plt.subplots(1, figsize=(10, 10))
    ax.imshow(pil_image)
    categories = set()
    category_to_color = {
        'Title': 'orchid',
        'Image': 'forestgreen',
        'Table': 'tomato',
    }

    for segment in segments:
        # Scale the layout coordinates to the rendered pixmap size
        points = segment['coordinates']['points']
        layout_width = segment['coordinates']['layout_width']
        layout_height = segment['coordinates']['layout_height']
        scaled_points = [
            (x * pix.width / layout_width, y * pix.height / layout_height)
            for x, y in points
        ]
        box_color = category_to_color.get(segment['category'], 'deepskyblue')
        categories.add(segment['category'])
        rect = patches.Polygon(
            scaled_points, linewidth=1, edgecolor=box_color, facecolor='none'
        )
        ax.add_patch(rect)

    # Legend
    legend_handles = [patches.Patch(color='deepskyblue', label='Text')]
    for category in ['Title', 'Image', 'Table']:
        if category in categories:
            legend_handles.append(
                patches.Patch(color=category_to_color[category], label=category)
            )
    ax.axis('off')
    ax.legend(handles=legend_handles, loc='upper right')
    plt.tight_layout()
    plt.show()

def render_page(doc_list: list, page_number: int, print_text=True) -> None:
    pdf_page = fitz.open(file_path).load_page(page_number - 1)
    page_docs = [
        doc for doc in doc_list if doc.metadata.get('page_number') == page_number
    ]
    segments = [doc.metadata for doc in page_docs]
    plot_pdf_with_boxes(pdf_page=pdf_page, segments=segments)
    if print_text:
        for doc in page_docs:
            print(f'{doc.page_content}\n')

[43]: from langchain_core.documents import Document

def extract_page_numbers_from_chunk(chunk):
    elements = chunk.metadata.orig_elements
    page_numbers = set()
    for element in elements:
        page_numbers.add(element.metadata.page_number)
    return page_numbers

def display_chunk_pages(chunk):
    page_numbers = extract_page_numbers_from_chunk(chunk)
    docs = []
    for element in chunk.metadata.orig_elements:
        metadata = element.metadata.to_dict()
        if "Table" in str(type(element)):
            metadata["category"] = "Table"
        elif "Image" in str(type(element)):
            metadata["category"] = "Image"
        else:
            metadata["category"] = "Text"
        metadata["page_number"] = int(element.metadata.page_number)

        docs.append(Document(page_content=element.text, metadata=metadata))

    for page_number in page_numbers:
        render_page(docs, page_number, False)

extract_page_numbers_from_chunk(chunks[3])
display_chunk_pages(chunks[3])

(Output: the pages spanned by this chunk are rendered here with colored bounding boxes drawn around the detected elements.)
4 RAG Pipeline
[44]: from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_openai import ChatOpenAI
from base64 import b64decode

def parse_docs(docs):
"""Split base64-encoded images and texts"""
b64 = []
text = []
for doc in docs:
try:
b64decode(doc)
b64.append(doc)
except Exception as e:
text.append(doc)
return {"images": b64, "texts": text}

def build_prompt(kwargs):
    docs_by_type = kwargs["context"]
    user_question = kwargs["question"]

    # Concatenate the retrieved text/table elements into a single context string
    context_text = ""
    if len(docs_by_type["texts"]) > 0:
        for text_element in docs_by_type["texts"]:
            context_text += text_element.text

    # Construct the prompt with context (including images)
    prompt_template = f"""
    Answer the question based only on the following context, which can include text, tables, and the image(s) below.
    Context: {context_text}
    Question: {user_question}
    """

    prompt_content = [{"type": "text", "text": prompt_template}]

    if len(docs_by_type["images"]) > 0:
        for image in docs_by_type["images"]:
            prompt_content.append(
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image}"},
                }
            )

    return ChatPromptTemplate.from_messages(
        [
            HumanMessage(content=prompt_content),
        ]
    )

chain = (
{
"context": retriever | RunnableLambda(parse_docs),
"question": RunnablePassthrough(),
}
| RunnableLambda(build_prompt)
| ChatOpenAI(model="gpt-4o-mini")
| StrOutputParser()
)

chain_with_sources = {
"context": retriever | RunnableLambda(parse_docs),
"question": RunnablePassthrough(),
} | RunnablePassthrough().assign(
response=(
RunnableLambda(build_prompt)
| ChatOpenAI(model="gpt-4o-mini")
| StrOutputParser()
)
)

[45]: response = chain.invoke(


"What is the attention mechanism?"
)

print(response)

The attention mechanism, specifically the "Scaled Dot-Product Attention," is a
method that computes a weighted sum of values based on the compatibility of
queries with keys. The key steps involved are:

1. **Input Matrices**: It takes three matrices as input: queries (Q), keys (K),
and values (V), all of which are vectors that represent different aspects of the
input data.

2. **Dot Products**: The dot products of the queries and keys are computed,
scaled by the square root of the dimension of the keys (√dk).

3. **Softmax**: A softmax function is applied to the scaled dot products to
obtain the attention weights, which indicate the importance of each value based
on its corresponding key.

4. **Weighted Sum**: Finally, these weights are used to compute a weighted sum
of the values (V), resulting in the output of the attention mechanism.

This attention mechanism allows the model to focus on relevant parts of the
input sequence for each output element, enabling it to capture relationships and
dependencies effectively.

Additionally, the multi-head attention expands this concept by running several
attention functions in parallel to gather information from different
representation subspaces.

[47]: response = chain_with_sources.invoke(


"What is multihead?"
)

print("Response:", response['response'])

#Context

# print("\n\nContext:")
# for text in response['context']['texts']:
# print(text.text)
# print("Page number: ", text.metadata.page_number)
# print("\n" + "-"*50 + "\n")

for image in response['context']['images']:
    display_base64_image(image)

Response: Multi-head attention is a mechanism that extends the standard
attention function by using multiple attention heads in parallel. Instead of
computing a single set of attention scores, multi-head attention projects the
input queries, keys, and values into multiple lower-dimensional spaces. Each
attention head then performs the attention function independently, allowing the
model to capture different aspects of the input data simultaneously.

The outputs from all the attention heads are concatenated and linearly
transformed to produce the final output. This approach enables the model to
attend to information from different representation subspaces at various
positions, enhancing its ability to learn complex patterns within the data.

5 THANK YOU!
