Knowledge Graph
Introduction
Transforming unstructured text into a structured knowledge graph makes the
information it contains far easier to process, query, and retrieve.
Knowledge graphs represent relationships and entities in a way that is both
human-readable and machine-interpretable, enabling a range of applications such as
semantic search, recommendation systems, and data-driven insights.
This post outlines the step-by-step process to build and visualize a knowledge graph
using Neo4j and Python. With Neo4j as the graph database and Python libraries like
LlamaIndex for text processing, we will extract meaningful entities and relationships
from raw text and visualize them in Neo4j. Whether you are a data scientist, software
developer, or enthusiast, this guide will help you get started with knowledge graph
construction.
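To make the idea concrete, a knowledge graph can be viewed as a set of (subject, relation, object) triples. The sketch below uses hand-written triples based on the Einstein sentence that appears as an example later in this post; they are illustrative only and are not output from LlamaIndex. The helper name build_adjacency is hypothetical.

```python
from collections import defaultdict

# Hand-written, illustrative triples; a real run would extract these from text.
triples = [
    ("Albert Einstein", "was", "theoretical physicist"),
    ("Albert Einstein", "developed", "theory of relativity"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (relation, object) pairs."""
    adjacency = defaultdict(list)
    for subject, relation, obj in triples:
        adjacency[subject].append((relation, obj))
    return dict(adjacency)


graph = build_adjacency(triples)
print(graph["Albert Einstein"])
```

Each key is an entity (a node) and each pair is a labelled edge to another entity, which is the same shape Neo4j stores as nodes and relationships.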
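Create a ".env" file in the project folder with the following settings, which the script loads with load_dotenv():
OPENAI_API_KEY=your_openai_api_key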
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_neo4j_password
NEO4J_DATABASE=neo4j
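A missing setting usually surfaces later as a confusing connection or authentication error. The following is a minimal sketch, not part of the original script, of a check that reports unset variables early. The helper name missing_settings and the REQUIRED_VARS list are assumptions made for illustration.

```python
import os

# Names taken from the settings listed above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "NEO4J_URI",
    "NEO4J_USER",
    "NEO4J_PASSWORD",
    "NEO4J_DATABASE",
]


def missing_settings(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_settings() right after load_dotenv() and stopping when the returned list is not empty keeps the failure close to its cause.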
● Download Neo4j Desktop or Neo4j Server from the official Neo4j website and follow the installation instructions for your platform.
Steps to set up Neo4j and ensure proper connection:
1. Download and install Neo4j Desktop
2. Install APOC plugin: Go to the "Manage" screen, then the "Plugins" tab. Click
"Install" in the APOC box.
3. Configure Neo4j: Open the "neo4j.conf" file and uncomment or add these lines:
server.bolt.enabled=true
server.bolt.listen_address=:7687
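Once Neo4j has been restarted with these settings, it can help to confirm that something is listening on the Bolt port before running the main script. The sketch below is an assumption-laden convenience check, not part of the original post: it only tests that a TCP connection succeeds, not that the service is Neo4j or that the credentials are valid. The helper name bolt_port_open is hypothetical.

```python
import socket


def bolt_port_open(host="localhost", port=7687, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False
```

If bolt_port_open() returns False, the database is probably not running or the listen address in "neo4j.conf" differs from the one in NEO4J_URI.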
import os
from dotenv import load_dotenv
from llama_index.core import Document, Settings, StorageContext
from llama_index.core import KnowledgeGraphIndex
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core.vector_stores.simple import SimpleVectorStore
from llama_index.graph_stores.neo4j import Neo4jGraphStore
from neo4j import GraphDatabase
load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")
neo4j_uri = os.getenv("NEO4J_URI")
neo4j_user = os.getenv("NEO4J_USER")
neo4j_password = os.getenv("NEO4J_PASSWORD")
Define a function to connect to the Neo4j database.
def connect_to_neo4j():
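The body of connect_to_neo4j is not reproduced here. One plausible shape, shown as a sketch rather than the original implementation, opens a driver with GraphDatabase.driver and checks the connection with verify_connectivity, both of which are part of the official neo4j Python driver. The explicit parameters and the driver_factory argument are assumptions added so the helper can be exercised without a running database.

```python
def connect_to_neo4j(uri, user, password, driver_factory=None):
    """Open a Neo4j driver and fail fast if the server is unreachable."""
    if driver_factory is None:
        # Imported lazily; by default the real driver from the neo4j package is used.
        from neo4j import GraphDatabase
        driver_factory = GraphDatabase.driver
    driver = driver_factory(uri, auth=(user, password))
    # Raises an exception if the server cannot be reached or rejects the login.
    driver.verify_connectivity()
    return driver
```

In the script this would be called as driver = connect_to_neo4j(neo4j_uri, neo4j_user, neo4j_password), which matches the driver.close() call at the end.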
Define a function to clear all nodes and relationships in the Neo4j database.
def clear_database():
    # Run the Cypher statement that deletes all nodes and relationships (not shown here).
    print("Database cleared.")
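The delete step itself is not reproduced in the listing. A minimal sketch of it, assuming a driver object created with the neo4j package is passed in (the original function takes no arguments and presumably uses a module-level driver), could look like this. MATCH (n) DETACH DELETE n is standard Cypher for removing every node together with its relationships.

```python
def clear_database(driver, database="neo4j"):
    """Delete every node and relationship in the given database."""
    with driver.session(database=database) as session:
        # DETACH DELETE removes each node along with its relationships.
        session.run("MATCH (n) DETACH DELETE n").consume()
    print("Database cleared.")
```

This is destructive and cannot be undone, so it is best pointed at a scratch database. On very large graphs a single delete statement can exhaust memory, but for a tutorial-sized graph it is sufficient.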
Define a function to create a knowledge graph from the input text using LlamaIndex and
store it in Neo4j.
def create_and_store(text):
    # Wrap the raw text in a LlamaIndex Document.
    documents = [Document(text=text)]
    # Neo4j holds the extracted triplets; the vector store stays in memory.
    graph_store = Neo4jGraphStore(username=neo4j_user, password=neo4j_password,
                                  url=neo4j_uri, database="neo4j")
    vector_store = SimpleVectorStore()
    storage_context = StorageContext.from_defaults(graph_store=graph_store,
                                                   vector_store=vector_store)
    # The LLM extracts up to 10 (subject, relation, object) triplets per chunk.
    index = KnowledgeGraphIndex.from_documents(documents,
        storage_context=storage_context, max_triplets_per_chunk=10)
    index.storage_context.graph_store.persist(persist_path="D:/knowledge_graph")
    query_engine = index.as_query_engine(include_text=False,
                                         response_mode="tree_summarize")
    # `result` comes from a node-count query run against Neo4j (not shown here).
    node_count = result.single()["node_count"]
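The last line of the function reads node_count from a result whose query does not appear in the listing. A sketch of a query that would produce such a result is given below. The function name count_nodes and its parameters are assumptions; session.run and Result.single are real calls in the neo4j Python driver.

```python
def count_nodes(driver, database="neo4j"):
    """Return the number of nodes currently stored in the database."""
    with driver.session(database=database) as session:
        result = session.run("MATCH (n) RETURN count(n) AS node_count")
        # single() returns the only record; index it by the alias used in the query.
        return result.single()["node_count"]
```

A count of zero after indexing suggests that no triplets were extracted or that the graph store is pointing at a different database.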
Define a function to retrieve nodes and relationships from the Neo4j database for
visualisation.
def visualise():
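The body of visualise is not reproduced in the listing. As a sketch of the retrieval step it describes, the function below reads relationships back as (subject, relation, object) tuples. It assumes that a neo4j driver is passed in and that entity names live in an "id" property on each node, which is how Neo4jGraphStore appears to write them; adjust the property name if your nodes differ. The name fetch_triples and the limit parameter are hypothetical.

```python
def fetch_triples(driver, database="neo4j", limit=100):
    """Return up to `limit` (subject, relation, object) tuples from Neo4j."""
    query = (
        "MATCH (s)-[r]->(o) "
        "RETURN s.id AS subject, type(r) AS relation, o.id AS object "
        "LIMIT $limit"
    )
    with driver.session(database=database) as session:
        # Build the list inside the session so the result is fully read
        # before the session closes.
        return [
            (rec["subject"], rec["relation"], rec["object"])
            for rec in session.run(query, limit=limit)
        ]
```

To see the graph itself, running MATCH (n)-[r]->(m) RETURN n, r, m in Neo4j Browser draws the nodes and relationships directly.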
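Define a function that clears the database and then builds the knowledge graph from the input text.
def create_knowledge_graph(text):
    clear_database()
    create_and_store(text)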
Provide an example text and run the knowledge graph creation process.
text = """
"""
create_knowledge_graph(text)
driver.close()
Run the script from the terminal:
python your_script_name.py
Text: Albert Einstein was a theoretical physicist who developed the theory of relativity.
Text: The Industrial Revolution, which began in the late 18th century, marked a major
turning point in history. This period saw a shift from manual labour and animal-based
production to machine-based manufacturing. It started in Great Britain and quickly
spread to other parts of Europe and North America. The revolution brought about
significant technological, socioeconomic, and cultural changes.
Key innovations during this time included the steam engine, developed by James
Watt, which became a primary source of power for the new factories. The textile
industry saw remarkable advancements with inventions like the spinning jenny by
James Hargreaves and the power loom by Edmund Cartwright. These innovations
dramatically increased production capacity and efficiency.
The period also saw significant scientific advancements. The work of scientists like
Michael Faraday in electromagnetics laid the groundwork for future technological
developments. Charles Darwin's theory of evolution by natural selection, published in
'On the Origin of Species,' revolutionised the field of biology and our understanding of
life on Earth.
While the Industrial Revolution brought about unprecedented economic growth and
technological progress, it also had negative consequences. Environmental pollution
increased dramatically, and social inequalities became more pronounced. These
issues continue to shape discussions about sustainable development and social
justice in the modern world.
The legacy of the Industrial Revolution continues to influence our lives today, from
the way we work and travel to how we understand the relationship between
technology, society, and the environment.