Learning Graph DB in One Night - Neo4j - by Prashant Mudgal - Towards Data Science


Learning Graph DB in one night: Neo4j

I had one night to decide whether spending a large amount of time on graph databases would be fruitful.

Prashant Mudgal

Published in Towards Data Science · 7 min read · Dec 23, 2020

I spent a large amount of time last year developing a recommendation system for a telecom client’s users. It turned out to be a massively difficult problem to take on and finish in the short time stipulated. Last week I faced a similar-sized problem, with a quick turnaround time for devising an initial strategy. I was well aware of the landmines that data-driven methods carry, so I wanted to test another approach.

Image by Author (one can create beautiful art with context-free grammar). I chose this picture because it gels well with graphs, networks, and hallucinations of the night!

Last year, someone had mentioned Neo4j to me as a database for recommendation systems, but I didn’t pay much heed to it. I first heard about Neo4j in 2016, when I downloaded it along with the Panama Papers data to expose the shameless tax avoiders and owners of overseas shelf companies from my country. I ran a query here and there for an hour and a half while sitting at Cafe Jax on 84th St., but eventually it met the fate of most hobby projects: swept under the carpet in an oblivious dimension.

Fast forward to propitious 2020: I decided to delve a little into graph databases, with the following factors to consider:

1. Run a truncated business problem rather than some tutorial.

2. Can I get meaningful information in a short amount of time?

3. How scalable is this entire thing on my moderately powerful machine?

4. How flexible is it in comparison to the pythonic approach (data manipulation, feature generation, etc.)?

I started at around 7:30 pm, shortly after dinner and finished at around 5:20
am. I relied heavily on the docs and examples on the documentation
website.

1. Download and Install

The installer can be found at https://neo4j.com/download/; one has to fill in a small form before being allowed to download (another data collection ploy).

After installation, you will be welcomed with open arms in the Neo4j
community(at least that’s what the prompts on the screens say) and if you
don’t have time for faffing around, you will diligently close all such
paraphernalia and get to business straight away.

2. Set up a Graph DB
I am sure you can create a new Project; within it, you will have to create a database. Click on Add Database.

Image by Author

Image by Author
Give a username and password of your choice and click on Start. Once it is
running, click on Open.

3. Data prep and Data location

First of all, where should you place your data? If you are using macOS, the location is

/Users/<your user folder>/Library/Application Support/com.Neo4j.Relate/Data/dbmss/<folder related to the DB you created above>/import/

Place your .csv files in the import folder.

<folder related to the DB you created above>: if it is your first project, you will have only one folder under /dbmss, so place your .csv files in its /import folder nonchalantly and audaciously.

(Only for Mac users: the folder above is much easier to find on Windows or Linux. On macOS, /Users/<your user folder>/Library is hidden, so type /Users/<your user folder>/Library into Spotlight search to get to the folders.)
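If you would rather not hunt for the hidden folder by hand, a small script can list the candidate import folders. This is only a sketch: the root path is the macOS one quoted above, and the helper name find_import_dirs is made up for illustration.

```python
from pathlib import Path

# macOS location quoted in the text above; other systems use a different root.
DBMSS_ROOT = (Path.home() / "Library" / "Application Support"
              / "com.Neo4j.Relate" / "Data" / "dbmss")


def find_import_dirs(dbmss_root):
    """Return the import folder of every database folder under dbmss_root."""
    root = Path(dbmss_root)
    if not root.is_dir():
        return []  # Neo4j Desktop not installed here, or a different layout
    return sorted(child / "import" for child in root.iterdir()
                  if (child / "import").is_dir())
```

Calling find_import_dirs(DBMSS_ROOT) on a first project should return a single folder, which is where the .csv files go.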

I scrubbed my data heavily and took only 1% of it for the experiment.

You can get all the .csv files from the GitHub here.

Service_Providers.csv lists telecom service providers, each specialising in one telco product such as Fiber, DTH, 4G LTE, etc.

Uses.csv maps the service providers in the above file to the major telecom players (known as Local Partners) in different geographies.

Similar.csv records which major telecom players are similar to each other.

4. Formulate problem statement in terms of data above

With the help of Neo4j, the data sources described above, the tooth fairy, and black magic, can we recommend service providers and products to the major telecom players in this B2B setting?

5. Let’s play

In step 2, you had opened the Neo4j browser. It looks something like this.

We can type commands next to the neo4j$ prompt.

Just as the universe has SQL, Neo4j has its own query language, CQL (Cypher Query Language). I wouldn’t call it much of a pain, but I touched only a small portion of it, so what do I know?

With the three CSV files in place, I ran the following to create the nodes and
relationships.
// Providers, with the geography they are located in and the service they provide
LOAD CSV WITH HEADERS FROM "file:///service_providers.csv" AS row
MERGE (pName:provider_name {name: row.Provider})
MERGE (pGeog:provider_loc {name: row.Geography})
MERGE (pServs:provider_serv {name: row.Services})
MERGE (pName)-[:Located_In]->(pGeog)
MERGE (pName)-[:provides]->(pServs);

// Which local partner (client) uses which provider
LOAD CSV WITH HEADERS FROM "file:///uses.csv" AS row
MERGE (clientN:client_Name {name: row.Local_Partner})
MERGE (pName:provider_name {name: row.Provider})
MERGE (clientN)-[:Uses]->(pName);

// Which local partners are similar to each other
LOAD CSV WITH HEADERS FROM "file:///similar.csv" AS row
MERGE (clientN:client_Name {name: row.Local_Partner})
MERGE (userN:client_Name {name: row.User})
MERGE (clientN)-[:Is_Similar]->(userN);
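
MERGE creates a node or relationship only when an identical one does not already exist, which is why repeated provider names in the CSV do not produce duplicate nodes. The sketch below mimics that behaviour in plain Python with sets. The sample rows are invented for illustration; only the column names (Provider, Geography, Services) come from the statements above.

```python
import csv
import io

# Hypothetical rows standing in for service_providers.csv.
SAMPLE_CSV = """Provider,Geography,Services
Acme Fiber,Boston,Fiber
Acme Fiber,Boston,DTH
Beta Net,Chicago,Fiber
"""


def merge_rows(csv_text):
    """Mimic MERGE: keep one copy of each node and each relationship."""
    nodes = set()          # (label, name) pairs
    relationships = set()  # (start node, type, end node) triples
    for row in csv.DictReader(io.StringIO(csv_text)):
        provider = ("provider_name", row["Provider"])
        location = ("provider_loc", row["Geography"])
        service = ("provider_serv", row["Services"])
        # Adding to a set is a no-op when the item is already present.
        nodes.update([provider, location, service])
        relationships.add((provider, "Located_In", location))
        relationships.add((provider, "provides", service))
    return nodes, relationships


nodes, relationships = merge_rows(SAMPLE_CSV)
print(len(nodes), len(relationships))
```

Three rows give six nodes and five relationships here, because "Acme Fiber", "Boston" and "Fiber" are each stored once however often they appear.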

Image by Author
The sidebar of the database lists all the nodes that were created and all the relationships that exist between the nodes.

The nodes are what we query, and the relationships are used as filters in CQL.

#FunTimesBegin

Run the following command at the neo4j prompt:

MATCH (n) RETURN n
Image by Author

This is neat!

The visual representation tells me who is connected to whom, and through what underlying relationship. Such visuals can be great for storytelling and for a business audience.

I suspect that this graph will look really messy when the number of nodes is high.

6. Recommendations
This graph contains all the information in the data, and we can use CQL to unearth the relationships in it. We can find similar entities, what they have in common, what products they use, etc.

Let’s take the case of ‘Boston Locals’, which is one of the major telecom players (known as Local Partners).

#Other partners similar to ‘Boston Locals’

MATCH (boston:client_Name {name:"Boston Locals"})-[:Is_Similar]-(client_Name)
RETURN client_Name.name

Image by Author

Two other Major Players are similar to Boston Locals.

#Find products and local providers that are used by similar major players.
MATCH (boston:client_Name {name:"Boston Locals"}),
(boston)-[:Is_Similar]-(partner),
(provider:provider_name)-[:Located_In]->(provider_loc),
(provider)-[:provides]->(provider_serv),
(partner)-[:Uses]->(provider)
RETURN provider.name, provider_loc.name, collect(partner.name),
provider_serv.name, count(*) as count
ORDER BY count DESC

In the above query, the collect function gathers the matching partner names into a list, and count(*) counts the matched rows in each group.

Image by Author
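
To make the grouping in that query concrete, here is a rough plain-Python equivalent. All the data is invented for illustration and the function name recommend is hypothetical. Like the query, it does not filter out providers that the target already uses.

```python
from collections import defaultdict

# Hypothetical toy data shaped like the relationships in the graph.
IS_SIMILAR = [("Boston Locals", "Partner A"), ("Partner B", "Boston Locals")]
USES = [("Partner A", "Provider X"), ("Partner B", "Provider X"),
        ("Partner B", "Provider Y")]
LOCATED_IN = {"Provider X": "Boston", "Provider Y": "Chicago"}
PROVIDES = [("Provider X", "Fiber"), ("Provider Y", "DTH")]


def recommend(target):
    # -[:Is_Similar]- has no arrow, so match the relationship in either direction.
    partners = {b for a, b in IS_SIMILAR if a == target}
    partners |= {a for a, b in IS_SIMILAR if b == target}

    # Group by (provider, location, service); collect() becomes a list per group.
    groups = defaultdict(list)
    for partner, provider in USES:
        if partner not in partners or provider not in LOCATED_IN:
            continue
        for prov, service in PROVIDES:
            if prov == provider:
                groups[(provider, LOCATED_IN[provider], service)].append(partner)

    # count(*) is the number of matched rows per group; ORDER BY count DESC.
    rows = [(p, loc, sorted(names), svc, len(names))
            for (p, loc, svc), names in groups.items()]
    return sorted(rows, key=lambda r: r[4], reverse=True)


for row in recommend("Boston Locals"):
    print(row)
```

A provider used by more of the similar partners floats to the top, which is the whole recommendation logic in this setting.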

What we did above in Neo4j is a form of what the recommendation systems space calls Collaborative Filtering: one finds the similarity between items, users, or user-item pairs and uses it to recommend items, products, or services.

This isn’t as sophisticated as embeddings, neural networks, or matrix factorisation, but if the problem isn’t esoteric, why not go for a simpler solution!

7. Pythonic ways
It turns out that Neo4j can interact with Python via a driver.

pip install neo4j

Once that’s done, you can easily open a session on the current Neo4j DB from a Python file (make sure that the DB is running, otherwise you will get ServiceUnavailable errors).

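from neo4j import GraphDatabase

# Connection details; use the username and password chosen in step 2
uri = "neo4j://localhost:7687"
user = "neo4j"
password = "hello@123"

# The driver manages connections; the session is what runs queries
driver = GraphDatabase.driver(uri, auth=(user, password))
session = driver.session()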

Then you can define a function that uses the above session to run queries.
The file is available on my Github here.
One can look at the recommendations through a simple print statement.

Image by Author
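
For readers without access to the file, such a function could look roughly like the sketch below. It is not the file from my GitHub: the name fetch_recommendations and the trimmed-down query are assumptions made for illustration. It relies only on session.run and record.data() from the driver, and takes the session as an argument.

```python
# Simplified variant of the recommendation query, with the partner name
# passed in as the $name parameter instead of being hard-coded.
RECOMMENDATION_QUERY = """
MATCH (target:client_Name {name: $name})-[:Is_Similar]-(partner),
      (partner)-[:Uses]->(provider:provider_name),
      (provider)-[:provides]->(service)
RETURN provider.name AS provider,
       service.name AS service,
       collect(partner.name) AS partners,
       count(*) AS count
ORDER BY count DESC
"""


def fetch_recommendations(session, partner_name):
    """Run the query on an open session and return one dict per result row."""
    result = session.run(RECOMMENDATION_QUERY, {"name": partner_name})
    return [record.data() for record in result]
```

With the session opened above, fetch_recommendations(session, "Boston Locals") returns a list of dictionaries that can be printed directly.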
It’s morning already!
After an initial litmus test and a tiring night, I was pleasantly surprised with
the results and the capabilities of Neo4j.

For the questions that I intended to find answers to:

1. Can I get meaningful information in a short amount of time?

Definitely! The visual information is an advantage in understanding the deeper relationships in the data. It also gives a vernacular that is easy to explain and to comprehend alongside the data.

2. How scalable is this entire thing on my moderately powerful machine?

I ran it on my machine with 16 GB RAM, a 512 GB HD, and a 6-core i7. I tried loading a file with 200K rows and 5 columns (all numeric data) and got a Java heap space error; I kept decreasing the file size but kept getting the error until I was down to 70K rows. I can easily use a pandas DataFrame or turicreate’s SFrame on those files on my machine without batting an eyelid. So, at the moment I am skeptical of scalability.
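
I did not try this during the night, but a commonly suggested way to ease memory pressure on large loads is to commit the rows in smaller batches rather than in one big transaction. Below is a sketch of that idea from the Python side, assuming rows shaped like uses.csv; the helper names and the batch size are hypothetical, and whether it would have avoided my heap error is untested.

```python
def batches(rows, size):
    """Yield consecutive slices of rows, each at most size long."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# UNWIND turns the $rows list back into one row per element on the server.
BATCH_QUERY = """
UNWIND $rows AS row
MERGE (c:client_Name {name: row.Local_Partner})
MERGE (p:provider_name {name: row.Provider})
MERGE (c)-[:Uses]->(p)
"""


def load_in_batches(session, rows, size=1000):
    """Send rows (a list of dicts) in chunks; return how many were sent."""
    sent = 0
    for chunk in batches(rows, size):
        # Each run() here is its own auto-commit transaction.
        session.run(BATCH_QUERY, {"rows": chunk}).consume()
        sent += len(chunk)
    return sent
```

The rows could come from csv.DictReader or from a pandas DataFrame via to_dict("records").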

3. How flexible is it in comparison to the pythonic approach (data manipulation, feature generation, etc.)?

Here I used a classic use case that can be solved with basic manipulations, but in an industrial setting with increasing complexity, similarity alone doesn’t yield good results. One needs to concoct feature spaces such as embeddings, which is possible in Neo4j, but I haven’t explored that. Neo4j Graph Data Science shows promise.
At this moment, I would like to include Neo4j in my Data Science life cycle
during the exploratory data analysis phase to form the hypotheses that I can
test using the usual pythonic ways.

With the help of CQL, I can find all the records that exhibit certain
characteristics and I can test the consistency of the results obtained from the
classical methods.

Epilogue: It was a productive night, time to sleep!

Photo by Jonathan Fink on Unsplash
