Learning Cypher Sample Chapter

Chapter No. 3 Manipulating the Database Write powerful and efficient queries for Neo4j with Cypher, its official query language

Uploaded by

Packt Publishing
Copyright
© © All Rights Reserved
Learning Cypher

Onofrio Panzarino








Chapter No. 3
"Manipulating the Database"
In this package, you will find:
A Biography of the author of the book
A preview chapter from the book, Chapter No. 3 "Manipulating the Database"
A synopsis of the book's content
Information on where to buy this book








About the Author
Onofrio Panzarino is a programmer with 15 years of experience working with various
languages (mostly with Java), platforms, and technologies. Before obtaining his Master
of Science degree in Electronics Engineering, he worked as a digital signal processor
programmer. Around the same time, he started working as a C++ developer for
embedded systems and PCs. Currently, he is working with Android, ASP.NET or C#,
and JavaScript for Wolters Kluwer Italia. During these years, he gained a lot of
experience with graph databases, particularly with Neo4j.
Onofrio resides in Ancona, Italy. His Twitter handle is @onof80. He is a speaker in
the local Java user group and also a technical writer, mostly for Scala and NoSQL. In
his spare time, he loves playing the piano with his family and programming with
functional languages.
First and foremost, I would like to thank my wife, Claudia, and my
son, Federico, who patiently supported me at all times.
Special thanks to the team at Packt Publishing. It has been a great
experience to work with all of you. The work of all the reviewers
was invaluable as well.
I would also like to thank all my friends who read my drafts and
gave me useful suggestions.


For More Information:
www.packtpub.com/learning-cypher/book


Learning Cypher
Among the NoSQL databases, Neo4j is generating a lot of interest due to the following
set of features: performance and scalability, robustness, its very natural and expressive
graph model, and ACID transactions with rollbacks. Neo4j is a graph database. Its model
is simple and based on nodes and relationships.
The model is described as follows:
Each node can have a number of relationships with other nodes
Each relationship goes from one node either to another node or the same node;
therefore, it has a direction and involves either only two nodes or only one
Both nodes and relationships can have properties, and each property has a name
and a value
Before Neo4j introduced Cypher as its preferred query language, utilizing Neo4j in a
real-world project was difficult compared to a traditional relational database. In
particular, querying the database was a nightmare, and executing a complex query
required the user to write an object that performed a graph traversal. Roughly speaking, a traversal is an
operation that specifies how to traverse a graph and what to do with the nodes and
relationships found during the visit. Though it is very powerful, it works in a very
procedural way (through callbacks), so its readability is poor and any change to the query
means modifying the code and building it. Cypher, instead, provides a declarative syntax,
which is readable and powerful, and a rich set of graph patterns that can be recognized in
the graph. Thus, with Cypher, you can write (and read) queries much more easily and be
productive from the beginning. This book will guide you through learning this language
from the ground up, and each topic will be explained with a real-world example.
What This Book Covers
Chapter 1, Querying Neo4j Effectively with Pattern Matching, describes the basic clauses
and patterns to perform read-only queries with Cypher.
Chapter 2, Filter, Aggregate, and Combine Results, describes clauses and tips that can be
used with patterns to elaborate results that come from pattern matching.
Chapter 3, Manipulating the Database, covers the write clauses, which are needed to
modify a graph.
Chapter 4, Improving Performance, talks about tools and practices to improve
the performance of queries.
Chapter 5, Migrating from SQL, explains how to migrate a database to Neo4j from the
ground up through an example.
Appendix, Operators and Functions, describes Cypher operators and functions in detail.



Manipulating the Database
In the previous chapters, you learned how to query and work with data stored
in a database. However, you don't yet know how to modify the database. As you
are going to see, creating, updating, and deleting data build on all the
notions we have learned so far. In this chapter, we will
learn the following:
How to use Neo4j Browser to prototype and test your queries quickly
The syntax and usage of the CREATE clause
How to merge data with an existing database
How to delete data from a database
Using Neo4j Browser
Neo4j Browser is a very useful tool that is distributed with Neo4j. It's a web-based
shell client, which we can use to interact in real time with a Neo4j Server database
without configuring or programming anything other than Cypher. Here, no Java
code is needed. The purpose of Neo4j Browser is to provide an easy interface for
prototyping databases and testing queries. It can be accessed by following these steps:
1. Start Neo4j Community. You can download the latest version from the Neo4j
download page at http://www.neo4j.org/download.
2. Choose a database location path and click on the Start button.


3. Wait a few seconds while the database is created.
4. Click on the link that appears in the Status panel (for example,
http://localhost:7474/) to open Neo4j Browser.
Now you should see your preferred browser open the web page as shown in the
following screenshot:


At the top of the page, there is the shell prompt. Here, you can write the Cypher
queries you want to be executed in real time against the database. The results will be
shown as output in the panel below. For example, let's take a look at the content of
the database by writing the following query:
MATCH (n) RETURN n
The panel below shifts and shows the query result. On the right-hand side of the
result panel, we have two buttons: one that shows the graph visualization and one
that shows the grid visualization:
As we saw in the first chapter, Neo4j Server supports a REST API. So, if you're going
to use a Neo4j database server, you'll use the REST API to interact with it. If you
interact with the database server from a programming language through a driver of
your choice, note that this driver (whether PHP, C#, Python, or any other) will wrap
REST calls to the Neo4j REST API. Although knowing the REST API is not strictly
necessary if you aren't a driver developer, it is useful to know that Neo4j Browser
can also be used to test REST API calls when something goes wrong with the driver.
The REST API is a layer that runs on the database server and actually performs
queries and modifications on databases. Java, C#, and other drivers just
abstract calls to this API, allowing programmers to write their applications without
worrying about the stack and the protocol used to communicate with the database.
For example, you can send the previous query to Neo4j with REST by typing the
following code in the prompt:
:POST /db/data/cypher { "query": "MATCH (n) RETURN n" }
Neo4j Browser returns the following result:
{
"columns": [
"n"
],
"data": []
}


The database is empty so we get zero rows. It's time to fill the database with new
nodes and relationships.
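To get a feeling for what a driver does on our behalf, here is a minimal Python sketch that builds the JSON body posted to the /db/data/cypher endpoint and counts the rows in a response shaped like the one shown earlier. The function names build_cypher_request and count_rows are our own (hypothetical), and the sketch makes no network call; it only illustrates the payload format used in this section.

```python
import json


def build_cypher_request(query):
    """Return the JSON body for a POST to /db/data/cypher (hypothetical helper)."""
    return json.dumps({"query": query})


def count_rows(response_text):
    """Count the rows in a response with "columns" and "data" keys."""
    response = json.loads(response_text)
    return len(response["data"])


body = build_cypher_request("MATCH (n) RETURN n")
print(body)

# The reply Neo4j Browser showed for the empty database
empty_reply = '{"columns": ["n"], "data": []}'
print(count_rows(empty_reply))
```

A real driver would add the HTTP transport, authentication, and error handling on top of this.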
Creating nodes and relationships
In this section, we will learn how to create nodes and relationships in our database.
Let's start with the simplest example, that is, creating our first node. In the prompt,
just type the following:
CREATE ()
This command creates a node without properties or labels. It's equivalent to the
following Java code:
Node n = graphDb.createNode();
The result panel now shows the result as Created 1 node, returned 0 rows in 825 ms.
The preceding command returned zero rows because we had no RETURN clause. It
just created an anonymous node in the database. This node is not very useful as it
is, as it can be referenced only by an ID. However, the command lets us introduce
the CREATE clause. The CREATE clause takes one argument: the pattern that expresses
the nodes and relationships to be created. All the patterns we learned in Chapter 1,
Querying Neo4j Effectively with Pattern Matching, are supported here. Any variable
used in the expression is bound to the newly created object so that it can be used
further in the query. For example, consider the following query:
CREATE (n)
RETURN n
By adding the RETURN clause, we get a result that consists of one row: the node
that was just created. Now, let's have a look at the CREATE clause in action with
common tasks.
Labels and properties
In this chapter, we'll model a social network. A user registered in the social network
is a node. Let's create a new one using the following query:
CREATE (u:User)
RETURN u


If you switch to the tabular view, Neo4j Browser's reply will be Added 1 label,
created 1 node, returned 1 row in 155 ms. This reply can be described as follows:
The added label is User, which is added to the new node.
The created node is bound to the variable u.
The variable u is returned to us. The graphical visualizer in the result panel
shows us a chart, as shown in the following screenshot:
The number inside the node is the node ID assigned by Neo4j. You will most likely
get another value when you run the previous query in your database.
Multiple labels
You can add as many labels as you need to a node by chaining them in a single
definition. The following query creates another user with two labels, User
and Inactive:
CREATE (u:User:Inactive)
RETURN u
The result is a new node.


Properties
The nodes we have created so far have no properties. The CREATE clause supports
the creation of properties along with their nodes in a single query. To do so, just
apply the patterns we introduced in Chapter 1, Querying Neo4j Effectively with Pattern
Matching, as shown in the following query:
CREATE (u:User {name: "John", surname: "Doe"})
RETURN u
If you copy and paste the preceding query in the prompt, you will see the result
panel showing a single node. When you click on the node, a window appears,
showing the properties of the node.
If you look at the preceding screenshot, you will see that the node in the middle
shows its name as the property value instead of its ID. The property value shown in
the node can be set using the style tab, which can be opened by clicking on the node
itself. This setting is just for visualization; it doesn't affect your database.


Creating multiple patterns
Patterns separated by commas are treated separately, so a single CREATE clause
can create multiple patterns. The following query creates three users in a
single call:
CREATE (a:User {name: "Jane", surname: "Roe"}),
(b:User {name: "Carlos", surname: "Garcia"}),
(c:User {name: "Mei", surname: "Weng"})
No result is returned, since the RETURN clause is missing. If we add RETURN a,b,c
to the query, we will get three nodes in a single row, one per column. To get them
in a single column, add RETURN [a,b,c] at the end of the query. Note that you
won't see any effect in the graph visualization panel of Neo4j Browser, but the return
values are shown in a tabular view; clearly, they are crucial if you query the database
programmatically.
Creating relationships
If you want to create relationships along with nodes, it is easy; just use the
relationship pattern, as shown in the following query:
CREATE (:User {name: "Jack", surname: "Smith"})
-[:Sibling]->
(:User {name: "Mary", surname: "Smith"})
Neo4j Browser shows a log of all the operations that are executed. The result shown
is Added 2 labels, created 2 nodes, set 4 properties, created 1 relationship, returned
0 rows in 472 ms.
The previous example gives us the opportunity to discuss the direction of a
relationship. In Neo4j, every relationship is directed, so you must specify a direction
once it is created. However, as the paths between nodes can be traversed in both
directions, the application has the responsibility to either ignore or consider the
direction of the querying path. In fact, as mentioned in Chapter 1, Querying Neo4j
Effectively with Pattern Matching, you can query a relation and ignore the direction.
The following query is an example of this:
MATCH (a) -[r:Sibling]-(b)
RETURN a,r,b
Note that ignoring the direction of queries has performance implications on large
datasets, especially if used in conjunction with variable length paths. We will see
this in detail in the next chapter.


Creating full paths
Using a path pattern, you can create a full path in Neo4j. The following query
illustrates how this is done:
CREATE p = (jr:User {name: "Jack", surname: "Roe"})
-[:Sibling]->
(:User {name: "Mary", surname: "Roe"})
-[:Friend]->
(:User {name: "Jane", surname: "Jones"})
-[:Coworker {company: "Acme Inc."}]
->(jr)
RETURN p
The preceding query creates three nodes and then three relationships among them:
Sibling, Friend, and Coworker. The latter has the company name as its property,
and the end node is the first node of the path (the user Jack Roe). Look at the way
we have referenced a node that was specified previously by putting a variable in
the node's definition (jr:User {name: "Jack", surname: "Roe"}) and using the
variable jr afterwards.
As the query returns the path, Neo4j Browser shows a graph of the whole path, as
shown in the following screenshot:


Creating relationships between existing nodes
using read-and-write queries
You can use the CREATE clause in conjunction with the MATCH statement to add
relationships between existing nodes. For example, the following query creates
a relationship between two nodes:
MATCH (a:User {name: "Jack", surname: "Roe"}),
(b:User {name: "Jack", surname: "Smith"})
CREATE (a) -[r:Knows]-> (b)
RETURN a,r,b
The MATCH statement matches the user nodes Jack Roe and Jack Smith with
the variables a and b; then, the CREATE clause creates a relation of the type Knows
between them. Finally, both the user nodes and their new relationship are returned
as output.
Generally, this kind of read-and-write query has the following two parts:
The first is the reading part. Here, we can use any reading clause (START,
MATCH, OPTIONAL MATCH, and WITH).
The second part is writing, where we can use the CREATE command or other
writing clauses that we are going to learn in the next section.
Modifying existing data
The last query in the previous section creates a relationship between two nodes. If
we run that query twice, we will have two relations between those nodes. In most
cases, this redundancy is unnecessary and useless for us. Suppose our social network
was online and had a button called "Add Friend". In this scenario, if two users, say A
and B, click on this button at the same time to add each other as friends, the relation
would be doubled in the database. This is a waste of storage. In this context, we need
to check the database and create the relation only if it does not exist. This is why an
OPTIONAL MATCH clause is required to prevent double storage. This is illustrated in
the following query:
MATCH (a:User {name: "Jack", surname: "Roe"}),
(b:User {name: "Jack", surname: "Smith"})
OPTIONAL MATCH (a) -[r:Knows]- (b)
WITH a,r,b
WHERE r IS NULL
CREATE (a) -[rn:Knows]-> (b)
RETURN a,rn,b


This query, first of all, finds the users Jack Roe and Jack Smith in the database
(the MATCH clause), then checks whether they are connected through a relation of
the type Knows (the OPTIONAL MATCH clause). If not, (r IS NULL means that the
relation cannot be found) the CREATE command that follows will create a relationship
between the nodes. The WITH clause is necessary to apply the WHERE clause to the
whole query. If the WITH clause is not used, the WHERE clause is applied only to the
OPTIONAL MATCH clause.
If you run the preceding query after the query mentioned in the Creating relationships
between existing nodes using read-and-write queries section, you'll get no rows. This is
because the relationship is already created in the database. Clearly, this query isn't
easy to read or write and it's error-prone. For these reasons, Cypher provides us with
two keywords to deal with existing data.
Creating unique patterns
The complexity of the preceding query is due to the fact that we have to check
the nonexistence of a relationship before creating it. This is because we want that
relationship to be unique. Fortunately, Cypher provides us with a command that
wraps such a check and ensures that the pattern specified is unique in the database.
For example, we can rewrite the preceding query using the CREATE UNIQUE
command, as shown in the following query:
MATCH (a:User {name: "Jack", surname: "Roe"}),
(b:User {name: "Jack", surname: "Smith"})
CREATE UNIQUE (a) -[rn:Knows]-> (b)
RETURN a,rn,b
Using the CREATE UNIQUE command in the preceding query saved us from writing
the entire OPTIONAL MATCH and WHERE clauses. My preferred motto is that the more
code you write, the more bugs you hide; here, the shorter query is the preferred choice.
However, there are two important differences between the preceding query and
the one in the previous section. They are as follows:
If the CREATE UNIQUE command finds the relationship multiple times in the
database, it will throw an error. For example, if two instances of the Knows
relationship exist between the users Jack Roe and Jack Smith, then the
query with the CREATE UNIQUE command will fail with an error, while the
query with the OPTIONAL MATCH command will succeed (it will not create the
relationship). In either case, neither query makes any modifications to the
database. This difference is not a disadvantage of the CREATE UNIQUE command
but rather an advantage. An error thrown by the query means that the database
is corrupted, as it has multiple instances of a relationship (or any pattern)
that should be unique.


In the next chapter, we will learn how to enforce certain
assertions using constraints.
The query with the OPTIONAL MATCH command returns a row only if it
creates a new relationship. However, the query with the CREATE UNIQUE
command will return a result if it finds a relationship or creates a new one.
This can be a useful feature in some contexts; we can know the state
of certain paths in the database after the CREATE UNIQUE command is
executed without performing another read-only query.
Yet, the CREATE UNIQUE command can be even more useful. Suppose we don't know
if a user named Jack Smith has been created; if not, we have to create it and link it to
the user Jack Roe. Consider the following read-and-write query:
MATCH (a:User {name: "Jack", surname: "Roe"})
CREATE UNIQUE (a) -[rn:Knows]->
(b:User {name: "Jack", surname: "Smith"})
RETURN a,rn,b
First of all, it looks for the user Jack Roe in the database, binding it to the variable
a. If it cannot be found, the query will finish the execution and return zero rows.
Otherwise, it executes the CREATE UNIQUE command, and there are four possible
scenarios, which are listed as follows:
1. The full path already exists and it is unique; we have the user node Jack Roe
with exactly one relationship with the user node Jack Smith. In this case, the
existing nodes and relationship are bound to the variables a, b, and rn. Then,
these variables are returned as the result.
2. Neither the Jack Smith node nor the relationship exists in the database.
In this case, the CREATE UNIQUE command creates the full path. The new
relation is bound to the variable rn, while the new node is bound to the
variable b.


3. The path (a)-[:Knows]-(b) exists multiple times. For example, the Knows
relationship exists multiple times between the nodes. If this happens, a
Neo.ClientError.Statement.ConstraintViolation error is thrown because the
CREATE UNIQUE command can't deal with multiple matches of the pattern.
4. Both Jack Roe and Jack Smith exist in the database as nodes, but there
is no Knows relationship between them. As the matching follows the all-or-
none rule, the Cypher engine creates a new Jack Smith node and a new
relationship bound to the variable rn. This is due to the fact that the purpose
of the CREATE UNIQUE command is to ensure that a whole pattern is unique
in the graph and if the node already exists but not the relationship, we do not
have the whole pattern in the graph.
The last scenario could be a problem because we would have duplicated a user in the
database. We can resolve this issue using the MERGE clause, which is discussed later
in the chapter.
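The four scenarios can be reproduced with a toy in-memory model written in Python. This is only a sketch of the behavior described above, not Neo4j's implementation: the ToyGraph class and its methods are hypothetical names, nodes are plain dictionaries, and the pattern is limited to one outgoing relationship from an already bound start node.

```python
class ToyGraph:
    """A tiny in-memory graph: nodes are property dicts, relationships are tuples."""

    def __init__(self):
        self.nodes = []  # node ID = index in this list
        self.rels = []   # (start ID, type, end ID)

    def create_node(self, **props):
        self.nodes.append(dict(props))
        return len(self.nodes) - 1

    def create_rel(self, start, rel_type, end):
        self.rels.append((start, rel_type, end))
        return len(self.rels) - 1

    def create_unique(self, start, rel_type, end_props):
        """Mimic CREATE UNIQUE (start)-[:rel_type]->(end {end_props}).

        Returns (relationship ID, end node ID, created flag).
        """
        found = [
            rel_id
            for rel_id, (s, t, e) in enumerate(self.rels)
            if s == start and t == rel_type
            and all(self.nodes[e].get(k) == v for k, v in end_props.items())
        ]
        if len(found) > 1:
            # Scenario 3: the pattern exists multiple times
            raise ValueError("pattern found multiple times")
        if found:
            # Scenario 1: bind to the existing path
            rel_id = found[0]
            return rel_id, self.rels[rel_id][2], False
        # Scenarios 2 and 4: create the missing path, end node included
        end = self.create_node(**end_props)
        return self.create_rel(start, rel_type, end), end, True


g = ToyGraph()
roe = g.create_node(name="Jack", surname="Roe")
g.create_node(name="Jack", surname="Smith")  # exists, but is not linked to roe

smith = {"name": "Jack", "surname": "Smith"}
_, _, created_first = g.create_unique(roe, "Knows", smith)
_, _, created_second = g.create_unique(roe, "Knows", smith)
print(len(g.nodes), created_first, created_second)
```

Starting from a graph where both users exist but are not linked (scenario 4), the first call leaves three nodes in the toy graph, the third being the duplicated Jack Smith; the second call finds the path and creates nothing.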


To summarize, the CREATE UNIQUE <path> clause works according to the following decision flow:
1. Is <path> found in the database? If not, the full path is created in the
database, even in the case of a partial match.
2. If it is found, is it found multiple times? If yes, a ConstraintViolation
error is thrown; if no, the variables are bound to the path found.
Complex patterns
As with the MATCH and CREATE clauses, you can join simple patterns to describe a
complex one. Consider the following query:
MATCH (a:User {name: "Jack", surname: "Roe"})
CREATE UNIQUE (a) -[kn:Knows]->
(b:User {name: "Jack", surname: "Smith"}),
(a) -[cw:Colleague]-> (b)
This query creates two relationships between two users. Only the relationships not
found in the database are created. If you launch this query after the query from the
previous section, you'll get the message Created 1 relationship, returned 0 rows
in 307 ms.


In fact, the relationship Knows and the user Jack Smith were already in the database,
while the Colleague relationship was missing. If all of them exist, this query makes
no modifications to the graph. The second time you launch this query, you'll get the
result Returned 0 rows in 229 ms, which means that neither relationships nor nodes
were created.
Note that the CREATE UNIQUE command looks for a path that exactly matches the
pattern. So, for example, the following query won't match either the existing user
node or the existing relationship. Instead, it will create a new relationship and a
new node.
MATCH (a:User {name: "Jack", surname: "Roe"})
CREATE UNIQUE (a) -[rn:Knows {friend: true}]->
(b:User {name: "Jack",surname: "Smith", age:34})
In fact, we haven't set the age property on the user Jack Smith in our database.
This behavior can produce unexpected results in some cases, as in the preceding example.
How can we update the user node without creating a new user when a new property appears
in the pattern? Again, this issue can be solved using the MERGE clause.
Setting properties and labels
First of all, we need to know how to set the property of an existing node. The SET
clause is just the ticket. Let's start with an example. Consider the following query:
MATCH (a:User {name: "Jack", surname: "Roe"})
SET a.age = 34
RETURN a
This query takes the user node Jack Roe and sets the age property for it; then,
it returns the updated node. Neo4j Browser shows the result as Set 1 property,
returned 1 row in 478 ms.
Note that the SET clause here works on the nodes found using the MATCH clause. This
means that we can set a property on a huge list of nodes if we don't write the MATCH
clause carefully. The following query sets the place property on all the nodes with
the surname property Roe:
MATCH (a:User {surname: "Roe"})
SET a.place = "London"
RETURN a
In our database, this query updates three nodes: Jane, Jack, and Mary Roe. Neo4j
Browser shows the result as Set 3 properties, returned 3 rows in 85 ms.


Again, you can chain several assignment expressions to make more property
changes at the same time. For example, to set the country as well, the query will
be as follows:
MATCH (a:User {surname: "Roe"})
SET a.place="London", a.country="UK"
RETURN a
The syntax to set a property to a relationship is the same, as shown in the
following query:
MATCH (:User{surname: "Roe"})-[r:Knows]-()
SET r.friend = true
This query finds all the Knows relationships of users with the surname property Roe
and sets the property friend to true for all of them.
Cloning a node
The SET clause can also be used to copy all the properties of a node to another. For
example, to copy the node x to the node y, use the following query:
SET y = x
Note that all of the destination node's properties will be removed before the
node is copied.
Copying a node is useful when a node needs cloning. For example, in our social
network, there could be a function to create an alias identity; the user could start
cloning his/her own identity and then modify the new one. This command can be
coded as shown in the following query:
MATCH (a:User {name: "Jack", surname: "Roe"})
CREATE (b:Alias)-[:AliasOf]->(a)
WITH a,b
SET b = a
RETURN a,b
This query, once it finds the user node to clone, creates a new node with the label
Alias that has a relationship of the type AliasOf with the source node. Then, it
copies all the properties from the source node to the new node and finally returns
both nodes. The command SET b = a doesn't affect the labels of the node b or its
relationships; it just copies the properties.
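The effect of SET b = a can be pictured with a small Python sketch. It is a toy model under our own assumptions, not Neo4j code: each node is a dictionary with a set of labels and a dictionary of properties, and set_all_properties is a hypothetical helper name.

```python
def set_all_properties(target, source):
    """Toy SET target = source: replace every property, leave labels alone."""
    # The old properties of the target are dropped, then the source's are copied
    target["props"] = dict(source["props"])


jack = {"labels": {"User"}, "props": {"name": "Jack", "surname": "Roe"}}
alias = {"labels": {"Alias"}, "props": {"nickname": "jr"}}

set_all_properties(alias, jack)
print(alias)
```

After the call, the alias carries Jack's name and surname, its former nickname property is gone, and its label is still Alias only.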


Adding labels to nodes
The SET clause can also be used to add one or more labels to a node, as shown in the
following query:
MERGE (b:User {name: "Jack", surname: "Smith"})
SET b:Inactive
The only difference is that we need to use the label separator instead of the property
assignment. To chain more labels, just append them with the separator, as shown in
the following query:
MERGE (b:User {name: "Jack", surname: "Smith"})
SET b:Inactive:NewUser:MustConfirmEmail
Merging matched patterns
The MERGE clause is a new feature of Cypher, introduced by Neo4j 2.0. The features
of the MERGE clause are similar to those of the CREATE UNIQUE command. It checks
whether a pattern exists in the graph. If not, it creates the whole pattern; otherwise,
it matches it. The main difference is that the pattern doesn't have to be unique. The
other differences are as follows:
The MERGE clause supports the single node pattern
The MERGE clause allows users to specify what to do when the pattern is
matched and what to do when the pattern is being created
In an earlier section, we saw two issues with the CREATE UNIQUE command. They
are as follows:
1. How can we create a new node if the pattern does not match, but match the
existing node if the node exists?
2. How can we set properties when merging nodes and relationships?
To answer the first question, let's recall the second query from the Creating unique
patterns section:
MATCH (a:User {name: "Jack", surname: "Roe"})
CREATE UNIQUE (a) -[rn:Knows]->
(b:User {name: "Jack", surname: "Smith"})


Now, if the intent of this query is to match an existing Jack Smith user node before
creating a relationship to it, it will fail. This is because if the relationship does not
exist, a new Jack Smith node will be created again. We can take advantage of the
single node pattern supported by the MERGE clause and write the following query:
MATCH (a:User {name: "Jack", surname: "Roe"})
MERGE (b:User {name: "Jack", surname: "Smith"})
WITH a,b
MERGE (a) -[rn:Knows]-> (b)
RETURN a,rn,b
To accomplish our goal, we had to split the query in two parts using the WITH clause.
The first step is to find the Jack Roe user node in the graph with the MATCH clause.
Then, the first MERGE clause ensures that a node with the two properties (the
name Jack and the surname Smith) exists in the database. In the latter part of the query,
the focus is on the relationship Knows between the two nodes involved; the second
MERGE clause ensures that the relationship exists after the execution. What happens if
the Jack Smith user exists twice in the database and the nodes are already related?
The MERGE clause wouldn't fail; it would succeed, returning two rows.
In the next chapter, we will learn how to create constraints in the database
to ensure that it won't ever create nodes with the same property value.
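The two-step merge can be sketched with a toy Python model. It illustrates the behavior described above under simplifying assumptions and is not Neo4j's implementation: nodes are dictionaries, relationships are tuples, and the helper names find_nodes, merge_node, merge_rel, and knows_smith are hypothetical.

```python
def find_nodes(nodes, **props):
    """Return the IDs (list indexes) of all nodes carrying the given properties."""
    return [i for i, node in enumerate(nodes)
            if all(node.get(k) == v for k, v in props.items())]


def merge_node(nodes, **props):
    """Match every node carrying props; create one node only if none match."""
    matches = find_nodes(nodes, **props)
    if not matches:
        nodes.append(dict(props))
        matches = [len(nodes) - 1]
    return matches


def merge_rel(rels, start, rel_type, end):
    """Ensure the relationship exists; return True if it had to be created."""
    if (start, rel_type, end) in rels:
        return False
    rels.append((start, rel_type, end))
    return True


def knows_smith(nodes, rels):
    """Model the MATCH, MERGE node, MERGE relationship sequence; return (a, b) rows."""
    rows = []
    for a in find_nodes(nodes, name="Jack", surname="Roe"):
        for b in merge_node(nodes, name="Jack", surname="Smith"):
            merge_rel(rels, a, "Knows", b)
            rows.append((a, b))
    return rows


nodes = [{"name": "Jack", "surname": "Roe"},
         {"name": "Jack", "surname": "Smith"}]
rels = []
print(knows_smith(nodes, rels))  # one row; no new node is created
nodes.append({"name": "Jack", "surname": "Smith"})  # a duplicate user
print(knows_smith(nodes, rels))  # two rows, one per Jack Smith node
```

The existing Jack Smith node is reused instead of being duplicated, and when two such nodes exist the toy query returns one row for each of them rather than failing.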
Now, about the second problem of how to set properties during merging operations,
the MERGE clause supports two interesting features. They are as follows:
ON MATCH SET: This clause is used to set one or more properties or labels on
the matched nodes
ON CREATE SET: This clause is used to set one or more properties or labels on
the new nodes
For example, suppose that we want to set the Jack Smith user node's place
property to London only if we are creating it, then the following query can be used:
MERGE (b:User {name: "Jack", surname: "Smith"})
ON CREATE SET b.place = "London"
If at the same time, we want to set his age property to 34 only if the user already
exists, then the following query can be used:
MERGE (b:User {name: "Jack", surname: "Smith"})
ON CREATE SET b.place = "London"
ON MATCH SET b.age = 34


Clearly, when we want to set a property in both cases, we can just append a SET
clause to a MERGE clause, as shown in the following query:
MERGE (b:User {name: "Jack", surname: "Smith"})
SET b.age = 34
Once you learn how to use the MERGE clause and the CREATE UNIQUE
command, you may wonder when to use either of these. As a general
rule, when in doubt, you should use the CREATE UNIQUE command
when the pattern is conceived as a whole path that must be unique in
the graph.
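The interplay of ON CREATE SET, ON MATCH SET, and a plain SET can be mimicked in a few lines of Python. This is a sketch under our own assumptions, not Neo4j's implementation: merge_user is a hypothetical helper, nodes are dictionaries, and only the first matching node is updated, whereas a real MERGE would match every node that fits the pattern.

```python
def merge_user(nodes, key, on_create=None, on_match=None, always=None):
    """Toy MERGE on a single-node pattern; returns (node, created)."""
    for node in nodes:
        if all(node.get(k) == v for k, v in key.items()):
            node.update(on_match or {})   # ON MATCH SET
            node.update(always or {})     # plain SET after MERGE
            return node, False
    node = dict(key)
    node.update(on_create or {})          # ON CREATE SET
    node.update(always or {})
    nodes.append(node)
    return node, True


nodes = []
smith = {"name": "Jack", "surname": "Smith"}
first, created = merge_user(nodes, smith,
                            on_create={"place": "London"},
                            on_match={"age": 34})
second, created_again = merge_user(nodes, smith,
                                   on_create={"place": "London"},
                                   on_match={"age": 34})
print(created, created_again, second)
```

The first call creates the node and sets only place; the second call matches it and sets only age, so the single node ends up with both properties.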
Idempotent queries
In certain applications, such as websites with several client types or parallel
applications, some commands happen to be sent multiple times from external
layers to the backend. This can happen for a number of reasons: the user interface
is not up to date, a user sends the same command more than once, there are
synchronization issues, and so on. In these cases, the command may be executed
multiple times, and clearly you don't want the second or the nth execution to have
any further effect on the database. A command is idempotent if executing it again
on the same graph has no effect beyond that of the first execution. Both the
MERGE and SET clauses allow you to write idempotent commands, which are very
useful in these increasingly common contexts.
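To make the difference concrete, here is a minimal sketch of two alternative queries, to be run separately. It reuses the Jack Smith user from the previous examples. The first query is not idempotent, because each execution adds one more node; the second is, because repeating it leaves the graph unchanged:

```cypher
// Query 1 (not idempotent): every execution creates another User node
CREATE (:User {name: "Jack", surname: "Smith", place: "London"})

// Query 2 (idempotent): the node is created at most once, and setting
// place to the same value again changes nothing
MERGE (b:User {name: "Jack", surname: "Smith"})
SET b.place = "London"
```

If a client resends the second query, the graph ends up in the same state as after the first execution.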
Deleting data
Cypher provides two clauses to delete data. They are as follows:
REMOVE: This clause is used to remove labels and properties from nodes
or relationships
DELETE: This clause is used to delete nodes and relationships from
the database
Removing labels
To remove a label from a node, you must use the REMOVE clause. The syntax is similar
to the one for the SET clause, as shown in the following query:
MERGE (b:User {name: "Jack", surname: "Smith"})
REMOVE b:Inactive:MustConfirmEmail


This query removes the labels Inactive and MustConfirmEmail from the node that
was matched. Note that we have chained the labels using the colon separator. If the
node lacks one or all of the specified labels, this query will not fail; it will only
remove the labels that are present.
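REMOVE and SET can also be combined in the same query to swap one label for another. The following is a minimal sketch; the Active label is hypothetical and is used only for illustration, while labels() is the built-in function that returns the labels of a node:

```cypher
// Active is a hypothetical label introduced for this example
MATCH (b:User {name: "Jack", surname: "Smith"})
REMOVE b:Inactive
SET b:Active
RETURN labels(b)
```

Running the query a second time should leave the node unchanged, because removing a missing label and setting an existing one have no effect.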
Removing properties
The REMOVE clause is the opposite of the SET clause. It can be used to remove a
property from a node, as shown in the following query:
MERGE (b:User {name: "Jack", surname: "Smith"})
REMOVE b.age
As Neo4j does not store NULL properties, the same result can be achieved by
setting the property to NULL, as shown in the following query:
MERGE (b:User {name: "Jack", surname: "Smith"})
SET b.age = NULL
The preceding query can be used effectively when working with parameters. In fact,
you can write the query as follows:
MERGE (b:User {name: {name}, surname: {surname}})
SET b.age = {age}
This query can be used both to set the age property of a user and, when the age
parameter is NULL, to remove the age property from the node. Again, all operations
with the REMOVE and SET clauses are idempotent, so you don't need to worry
whether the properties exist before you remove them.
Deleting nodes and relations
If you want to delete a node, use the DELETE clause, as shown in the following query:
MATCH (c:User {name: "Mei", surname: "Weng"})
DELETE c
This query looks for a node with the given name and surname using the MATCH
clause and then tries to delete it.
Three important points about the preceding query are as follows:
It is idempotent. If no node is found, the query won't fail; it just won't delete
anything.
Properties are deleted with the node. You do not need to remove all the
properties before deleting the node.


On the other hand, if the node to delete has at least one relationship with
another node, the query will fail and raise an exception, which can be seen
in the following screenshot:
Therefore, before deleting a node, you must be sure that it is not involved in any
relationship. If it is, how do you delete a relationship? The syntax is the same; you
just have to use the right variable, as shown in the following query:
MATCH (:User {name: "Mei", surname: "Weng"}) -[r]- ()
DELETE r
This query deletes any relationship that involves the user that was matched. So, to
delete a node along with all its relationships, should we perform two queries? Not
at all, we just need the following query:
MATCH (c:User {name: "Jack", surname: "Smith"})
OPTIONAL MATCH (c)-[r]- ()
DELETE r, c
This query deletes the node and any of its relationships. First of all, it finds the node;
then it matches the node with any existing relationship. The OPTIONAL MATCH clause
is needed because, if we used a simple MATCH clause, the query wouldn't delete a node
with zero relationships. Finally, the DELETE clause causes the relationships and the
node to be deleted.
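Later Neo4j releases (2.3 and newer, to the best of our knowledge) provide a shorter form for the same operation, DETACH DELETE. It is not available in the version this chapter targets, so treat the following as a sketch that assumes a more recent installation:

```cypher
// Assumes a Neo4j release that supports DETACH DELETE
MATCH (c:User {name: "Jack", surname: "Smith"})
DETACH DELETE c
```

On such a release, this deletes the node together with all its relationships, without the explicit OPTIONAL MATCH step.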
Clearing the whole database
Generalizing the preceding query, we can clear the whole database, deleting all the
nodes and relationships, just by changing the first MATCH clause to match every
node, as shown in the following query:
MATCH (a)
OPTIONAL MATCH (a)-[r]-()
DELETE a, r


We can also do this by using the START clause on all the nodes. We get the same
result. The query is as follows:
START a = node(*)
OPTIONAL MATCH (a)-[r]-()
DELETE a, r
Use it carefully because it will delete all of the data from the graph.
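Before running such a destructive query, it can be useful to check how much data it would remove. The following is a minimal sketch of a read-only dry run. It uses the same read phase, but returns counts instead of deleting anything; the column names nodes and relationships are arbitrary aliases:

```cypher
// Read-only: counts what the clearing query would delete
MATCH (a)
OPTIONAL MATCH (a)-[r]-()
RETURN count(DISTINCT a) AS nodes, count(DISTINCT r) AS relationships
```

DISTINCT is needed because a node appears once per relationship it takes part in, and each relationship is matched once from each of its two end nodes.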
Loops
Due to the nature of the Cypher queries, usually, you won't need something like a
loop used in other programming languages. In fact, as you have probably already
realized, the general structure of a Cypher query is formed by three phases and
each one of these is optional. The phases are as follows:
Read: This is the phase where you read data from the graph using the START,
MATCH, or OPTIONAL MATCH clauses.
Write: This is the phase where you modify the graph using CREATE, MERGE,
SET, and all the other clauses we learned in this chapter.
Return: This is the phase where you choose what to return to the caller by
using the RETURN clause. This part can be replaced by a WITH clause and then
the query can start again from the read phase.
The read phase is important because the write phase will be executed for every item
found in the read phase. For example, consider the following query:
MATCH (a:User {surname: "Roe"})
SET a.place = "London"
RETURN a
In this query, the SET command will be executed once for each node found with the
MATCH clause. Consequently, you usually won't need an explicit for-loop statement.
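The same idea extends to queries that chain several phases with the WITH clause. The following is a minimal sketch, assuming a hypothetical Knows relationship type and a hypothetical hasFriendInLondon property. The first SET runs once per matched user, and the second SET runs once per user and friend pair found by the second read phase:

```cypher
// Knows and hasFriendInLondon are hypothetical names used for illustration
MATCH (a:User {surname: "Roe"})
SET a.place = "London"
WITH a
MATCH (a)-[:Knows]-(f:User)
SET f.hasFriendInLondon = true
RETURN a, f
```

No explicit loop is written in either phase; the rows produced by each read phase drive the write clauses that follow it.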
Working with collections
The only case when you need to iterate is when you work with collections. In
the previous chapter, we saw that Cypher can use different kinds of collections:
collections of nodes, relationships, and properties. Sometimes, you may need to
iterate over a collection and perform a write operation for each item. There is a
clause for this purpose: the FOREACH clause. Its syntax is akin to the syntax
of the collection clauses we learned in the previous chapters. The syntax is as follows:
FOREACH (variable IN collection | command)


Let's see it in action with an example. Suppose that in our social network, you want a
function that traverses the shortest path from one user to another and creates a new
relationship of the type MaybeKnows between each node visited and the first user.
Does it sound difficult to achieve this with a single query? No, it can be done with just
two clauses: a MATCH clause and a FOREACH clause, as shown in the following query:
MATCH p=shortestPath(
  (a:User {name: "Mary", surname: "Smith"})-[*]-
  (b:User {name: "Jane", surname: "Jones"}) )
FOREACH (n IN tail(nodes(p)) |
  CREATE UNIQUE (n)-[:MaybeKnows]->(a))
In the first step, this query computes the shortest path between the two nodes, using
the pattern we learned in Chapter 1, Querying Neo4j Effectively with Pattern Matching.
Then, there is an iteration over all the nodes of the path except for the first one; for
each node, a unique relationship is created from that node to the first node of the
path (Mary Smith).
Now, let's take a look at the collection iterated by the FOREACH clause, that is,
tail(nodes(p)). The nodes function extracts all the nodes of the path, while the tail
function returns all the items of a collection except for the first. In this case, we use
the tail function because we don't want Cypher to create a relationship between the
user node Mary Smith and itself. In fact, Neo4j allows you to create self-loops, which
are relationships between a node and itself. Note that using self-loops is perfectly
valid in some contexts but not in this case, so we have avoided it by using the tail
function.
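The command inside FOREACH does not have to create relationships; other write clauses work as well. The following is a minimal sketch that marks every node on the same shortest path, this time including the first one. The visited property is a hypothetical name used only for illustration:

```cypher
// visited is a hypothetical property name
MATCH p=shortestPath(
  (a:User {name: "Mary", surname: "Smith"})-[*]-
  (b:User {name: "Jane", surname: "Jones"}) )
FOREACH (n IN nodes(p) |
  SET n.visited = true)
```

Because SET is idempotent, running this query again on the same graph should not change anything further.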
Summary
In this chapter, you learned how to use Neo4j Browser. Thanks to this very useful
testing and prototyping tool provided by Neo4j, you have learned a lot of new
clauses needed to modify the graph, such as CREATE, CREATE UNIQUE, SET, MERGE,
REMOVE, and DELETE.
Finally, you learned how to use the FOREACH clause to traverse a collection to
perform write operations.
In the next chapter, we will examine in depth how to improve the performance
of our Cypher queries, and how to enforce important assertions about our graph
according to the peculiarities of the domain model.



Where to buy this book
You can buy Learning Cypher from the Packt Publishing website:
http://www.packtpub.com/learning-cypher/book.
Free shipping to the US, UK, Europe and selected Asian countries. For more information, please
read our shipping policy.
Alternatively, you can buy the book from Amazon, BN.com, Computer Manuals and
most internet book retailers.



















