2024/11/7 10:59 PM Building a Smarter RAG: Implementing Graph-based RAG with Neo4j | by Vinay Jain | Nov, 2024 | Medium

Building a Smarter RAG: Implementing Graph-based RAG with Neo4j
Vinay Jain
3 min read · 3 days ago

Flowchart of Graph-RAG

Ever wondered how to make your RAG (Retrieval-Augmented Generation) system
understand relationships between pieces of information? In this article, I’ll build a
Graph-based RAG system using Neo4j that not only finds relevant information but
also understands how different pieces of data connect to each other.

What You’ll Learn


Setting up Neo4j for Graph-based RAG (https://neo4j.com/)

https://medium.com/@vinayjain449/building-a-smarter-rag-implementing-graph-based-rag-with-neo4j-570e105e2d4a
Loading and processing Wikipedia data

Creating a knowledge graph from text

Implementing hybrid search (vector + graph)

Prerequisites
Before we dive in, make sure you have:

Python 3.10+

Neo4j Database (Community or Enterprise edition)

OpenAI API key

Basic understanding of RAG systems

Setting Up Your Environment


First, let’s install all required packages:

pip install langchain langchain-community langchain-openai langchain-experimental
pip install neo4j wikipedia tiktoken yfiles_jupyter_graphs pypdf

Set up your environment variables:

import os

os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["NEO4J_URI"] = "your-neo4j-uri"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "your-password"
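A missing or empty variable usually only shows up later as a confusing connection error, so it can help to check the four values up front. Here is a minimal sketch, assuming the variable names used above; the helper `missing_env_vars` is a hypothetical name, not part of any library:

```python
import os

# The four variables this article sets; adjust if your setup differs.
REQUIRED_VARS = ["OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"]


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given mapping.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Please set:", ", ".join(missing))
```

Running this before the rest of the notebook gives you one clear message instead of a failure halfway through graph construction.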

Loading and Processing Data


We’ll use Wikipedia data about Microsoft as our example. Here’s how to load and
split the data:

from langchain_community.document_loaders import WikipediaLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load Wikipedia data
raw_documents = WikipediaLoader(query="Microsoft").load()

# Split into smaller, overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
documents = text_splitter.split_documents(raw_documents)
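If the two numbers above feel abstract, here is a simplified sketch of what chunk size and overlap mean. It uses a plain fixed-size sliding window, which is an assumption for illustration only: RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs and newlines, so its real chunks will differ. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help both entity extraction and vector search.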

Building the Knowledge Graph


Here’s where things get interesting. We’ll use LangChain’s graph transformer to
convert our text into a graph structure:

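from langchain_community.graphs import Neo4jGraph
from langchain_openai import ChatOpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer

# Connect to Neo4j using the NEO4J_* environment variables set earlier
# (assumed setup; the article does not show where `graph` is created)
graph = Neo4jGraph()

llm = ChatOpenAI(temperature=0)
llm_transformer = LLMGraphTransformer(llm=llm)

# Convert documents to graph format
graph_documents = llm_transformer.convert_to_graph_documents(documents)

# Add to Neo4j
graph.add_graph_documents(
    graph_documents,
    baseEntityLabel=True,
    include_source=True
)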

This code does something amazing: it automatically identifies entities and
relationships in your text and creates a graph structure. For example, if your text
mentions “Microsoft acquired GitHub”, it creates nodes for both companies and a
relationship “ACQUIRED” between them.
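To picture the result for the “Microsoft acquired GitHub” sentence, here is a toy sketch using plain dataclasses. These `Node` and `Relationship` classes are stand-ins written for this example, not LangChain’s own classes, and the `Organization` label is an assumption: the LLM decides which labels it actually assigns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str    # entity name, e.g. "Microsoft"
    type: str  # entity label (assumed here; chosen by the LLM in practice)


@dataclass(frozen=True)
class Relationship:
    source: Node
    target: Node
    type: str  # relationship name, e.g. "ACQUIRED"


def describe(rel):
    """Render a relationship in a Cypher-like arrow notation."""
    return (
        f"({rel.source.id}:{rel.source.type})"
        f"-[:{rel.type}]->"
        f"({rel.target.id}:{rel.target.type})"
    )


microsoft = Node(id="Microsoft", type="Organization")
github = Node(id="GitHub", type="Organization")
acquired = Relationship(source=microsoft, target=github, type="ACQUIRED")

if __name__ == "__main__":
    print(describe(acquired))
```

Two nodes plus one directed, typed edge is all it takes to store the fact in a form that a graph query can follow later.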


Setting Up Hybrid Search


The power of our system comes from combining vector search with graph traversal.
Here’s how we set it up:

from langchain_community.vectorstores import Neo4jVector
from langchain_openai import OpenAIEmbeddings

vector_index = Neo4jVector.from_existing_graph(
    OpenAIEmbeddings(),
    search_type="hybrid",
    node_label="Document",
    text_node_properties=["text"],
    embedding_node_property="embedding"
)
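With search_type="hybrid", the index combines vector similarity with keyword (full-text) search. The toy sketch below shows the general idea only: score each document both ways and keep the better score. The scoring functions, the two-dimensional vectors, and the name `hybrid_rank` are all assumptions made for illustration, not how Neo4j computes or normalizes its scores.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, text):
    """Fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_rank(query, query_vec, docs, top_k=2):
    """Rank (text, vector) pairs by the better of their two scores."""
    scored = []
    for text, vec in docs:
        score = max(cosine(query_vec, vec), keyword_score(query, text))
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


if __name__ == "__main__":
    docs = [
        ("Microsoft acquired GitHub", [1.0, 0.0]),
        ("Bananas are yellow", [0.0, 1.0]),
        ("GitHub hosts code", [0.6, 0.8]),
    ]
    print(hybrid_rank("who acquired github", [1.0, 0.0], docs))
```

The benefit of the combination is that an exact name match can still surface a chunk whose embedding happens to sit far from the query, and the reverse.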

The Magic: Combining Graph and Vector Search


Here’s one of the most important parts of our system — the retriever that combines
both search methods:

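def retriever(question: str):
    # structured_retriever (the graph lookup) is not shown in this article;
    # see the linked notebook for the full code
    structured_data = structured_retriever(question)
    # Text of the chunks returned by the hybrid vector index
    unstructured_data = [el.page_content for el in
                         vector_index.similarity_search(question)]

    final_data = f"""Structured data:
{structured_data}
Unstructured data:
{"#Document ".join(unstructured_data)}
    """
    return final_data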

This retriever:

1. Uses graph relationships to find structured information

2. Uses vector similarity to find relevant text chunks

3. Combines both results for a comprehensive answer
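The three steps above can be tried without a database by passing the two search functions in as arguments. This is a sketch of the same pattern, not the article’s code: `make_retriever` and the two stub lambdas are hypothetical names, and the stubs return canned strings where the real system would query Neo4j.

```python
from typing import Callable, List


def make_retriever(
    structured_retriever: Callable[[str], str],
    similarity_search: Callable[[str], List[str]],
) -> Callable[[str], str]:
    """Build a retriever that merges graph results with vector-search results."""

    def retriever(question: str) -> str:
        structured_data = structured_retriever(question)  # step 1: graph lookup
        unstructured_data = similarity_search(question)   # step 2: text chunks
        # step 3: combine both into one context string for the LLM
        return (
            "Structured data:\n"
            f"{structured_data}\n"
            "Unstructured data:\n"
            + "#Document ".join(unstructured_data)
        )

    return retriever


if __name__ == "__main__":
    # Stubs standing in for the Neo4j-backed functions (illustrative only).
    demo = make_retriever(
        lambda q: "Microsoft - ACQUIRED -> GitHub",
        lambda q: ["chunk one", "chunk two"],
    )
    print(demo("Which companies did Microsoft acquire?"))
```

Keeping the two sources as separate, labeled sections in the final string lets the LLM tell graph facts apart from free-text context when it writes the answer.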

Testing the System


Let’s try our system with some questions:

chain.invoke({"question": "Which companies did Microsoft acquire?"})

The system will:

1. Look for company entities in the graph

2. Follow “ACQUIRED” relationships

3. Find relevant context from vector search

4. Combine everything into a complete answer
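Steps 1 and 2 are the graph part, and they are easy to picture with a tiny in-memory example. The sketch below stores relationships as (source, relation, target) triples and follows the ACQUIRED edges from Microsoft. The triples are hand-written toy data and `follow` is a hypothetical helper; the real system runs this kind of lookup inside Neo4j.

```python
# Toy graph as (source, relation, target) triples, written by hand for this example.
TRIPLES = [
    ("Microsoft", "ACQUIRED", "GitHub"),
    ("Microsoft", "ACQUIRED", "LinkedIn"),
    ("Bill Gates", "FOUNDED", "Microsoft"),
]


def follow(triples, entity, relation):
    """Return every target reachable from `entity` via one `relation` edge."""
    return [
        target
        for source, rel, target in triples
        if source == entity and rel == relation
    ]


if __name__ == "__main__":
    print(follow(TRIPLES, "Microsoft", "ACQUIRED"))
```

A question like this is hard for pure vector search because the answer is a list spread across many chunks, while in the graph it is a single hop from one node.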

Here’s the Colab notebook which contains the full code:
https://colab.research.google.com/drive/1zpk3UgGlLkJl8o46Ui6BHWQgPJM5eV4G?usp=sharing

Why This Matters


Traditional RAG systems sometimes miss important connections between pieces of
information. By adding graph capabilities:

We can follow explicit relationships between entities

Questions about relationships become easier to answer

The system understands the context better

Limitations/Drawbacks of Graph-based RAG


While Graph-based RAG offers powerful capabilities, it’s important to understand its
limitations before implementation:

1. Higher Complexity

Requires managing both vector and graph databases

More complex setup and maintenance compared to traditional RAG

Steeper learning curve for development teams


2. Performance Overhead

Slower response times, because the LLM has to create the nodes and relationships

Higher computational resources needed

More expensive due to additional LLM calls and storage requirements

3. Data Quality Challenges

System effectiveness depends on accurate entity extraction

Potential conflicts between graph and vector search results

Updating content requires syncing both vector and graph stores

Next Steps
You can enhance this system by:

Adding more relationship types

Implementing custom entity extraction

Adding time-based relationships

Creating visualization for the graph

Conclusion
Graph-based RAG combines the best of both worlds: the semantic understanding of
vector search and the explicit relationships of graph databases. As a result, your
RAG system can not only find information but also understand how different
pieces of information connect to each other.
