
Neo4j 5 Cypher Manual

Table of Contents
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Cypher and Neo4j . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Cypher and Aura . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Queries. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Core concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Basic queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Cypher expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Conditional expressions (CASE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Reading clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Projecting clauses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Reading sub-clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Writing clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Reading/Writing clauses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Subquery clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Set operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Multiple graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Importing data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Listing functions and procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Configuration Commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Transaction Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Reading hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Index and constraint clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Administration clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Clause composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
CALL procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
CREATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
DELETE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
FINISH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
FOREACH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
LIMIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
LOAD CSV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
MATCH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
MERGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
OPTIONAL MATCH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
ORDER BY. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
REMOVE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
RETURN. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
SET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
SHOW FUNCTIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
SHOW PROCEDURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
SHOW SETTINGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
SKIP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
UNION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
UNWIND . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
USE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
WHERE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
WITH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Subqueries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
CALL subqueries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
CALL subqueries in transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
COLLECT subqueries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
COUNT subqueries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
EXISTS subqueries. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Primer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Fixed length patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Variable length patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
Shortest paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Non-linear patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Syntax and semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Values and types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Property, structural, and constructed values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Temporal values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
Spatial values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
Working with null. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
Lists. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Casting data values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Type predicate expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
Aggregating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
Database functions (new in 5.12) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
GenAI functions (new in 5.17) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
Graph functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
List functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
LOAD CSV functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
Logarithmic functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Numeric functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Trigonometric functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
Predicate functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
Scalar functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
String functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Spatial functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Temporal duration functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Temporal instant types functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
User-defined functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Vector functions (new in 5.18) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
Aggregating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
Database functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
Graph functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
List functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
LOAD CSV functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Mathematical functions - logarithmic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
Mathematical functions - numeric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
Mathematical functions - trigonometric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Predicate functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
Scalar functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
Spatial functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
String functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
Temporal functions - duration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
Temporal functions - instant types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
User-defined functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
Vector functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
GenAI integrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
Example graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
Generate a single embedding and store it . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
Generating a batch of embeddings and store them . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
GenAI providers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
Search-performance indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
Semantic indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
Syntax. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
Create, show, and drop constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
Syntax. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
Execution plans and query tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 679
Note on PROFILE and EXPLAIN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 679
Understanding execution plans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 679
Operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
Cypher runtimes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 855
Query tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 870
Query caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 877
Configure caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 877
Unifying query caches. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 877
Administration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 879
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 880
Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 880
Naming rules and recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 882
Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 883
Reserved keywords . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 884
Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 887
Operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 891
Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 909
Deprecations, additions, and compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 911
Neo4j 5.25 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 911
Neo4j 5.24 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 911
Neo4j 5.23 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 913
Neo4j 5.21 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 915
Neo4j 5.20 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 917
Neo4j 5.19 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 918
Neo4j 5.18 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 918
Neo4j 5.17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 919
Neo4j 5.16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 921
Neo4j 5.15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 922
Neo4j 5.14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 924
Neo4j 5.13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925
Neo4j 5.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926
Neo4j 5.11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926
Neo4j 5.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 927
Neo4j 5.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 929
Neo4j 5.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 931
Neo4j 5.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 932
Neo4j 5.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934
Neo4j 5.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934
Neo4j 5.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 935
Neo4j 5.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 936
Neo4j 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 936
Neo4j 5.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 937
Neo4j 4.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 950
Neo4j 4.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 957
Neo4j 4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 963
Neo4j 4.1.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 966
Neo4j 4.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 966
Neo4j 4.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 968
Neo4j 3.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 973
Neo4j 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 973
Neo4j 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 974
Neo4j 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 975
Neo4j 3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 976
Neo4j 3.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 976
Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 977
Appendix A: Cypher styleguide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 977
GQL conformance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 984
Tutorials and extended examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1005
Welcome to the Neo4j Cypher Manual.

Cypher is Neo4j’s declarative query language, allowing users to unlock the full potential of property graph
databases.

The Cypher Manual aims to be as instructive as possible to readers from a variety of backgrounds and
professions, such as developers, administrators, and academic researchers.

If you are new to Cypher and Neo4j, you can visit the Getting Started Guide → Introduction to Cypher
chapter. Additionally, Neo4j GraphAcademy has a variety of free courses tailored for all levels of
experience.

For a reference of all available Cypher features, see the Cypher Cheat Sheet.

For a downloadable PDF version of the Cypher Manual, visit the Neo4j documentation archive.

This introduction will cover the following topics:

• Overview

• Cypher and Neo4j

• Cypher and Aura

License: Creative Commons 4.0

Overview
This section provides an overview of Cypher and a brief discussion of how Cypher differs from SQL.

What is Cypher?
Cypher is Neo4j’s declarative graph query language. It was created in 2011 by Neo4j engineers as an
SQL-equivalent language for graph databases. Similar to SQL, Cypher lets users focus on what to retrieve
from a graph, rather than how to retrieve it. As such, Cypher enables users to realize the full potential of
their property graph databases by allowing for efficient and expressive queries that reveal previously
unknown data connections and clusters.

Cypher provides a visual way of matching patterns and relationships. It relies on the following ASCII-art
type of syntax: (nodes)-[:CONNECT_TO]->(otherNodes). Round brackets are used to represent nodes, and
-[:ARROWS]-> for relationships. Writing a query is effectively like drawing a pattern through the data in the
graph. In other words, entities such as nodes and their relationships are visually built into queries. This
makes Cypher a highly intuitive language to both read and write.
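
As an illustration of that syntax, the minimal sketch below (the Person and Movie labels and the ACTED_IN
relationship type are assumed here purely for the example) draws a pattern from actors to movies and
returns the matching names and titles:

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN p.name AS actor, m.title AS movie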

Cypher and SQL: key differences


Cypher and SQL are similar in many ways. For example, they share many of the same keywords, such as
WHERE and ORDER BY. However, there are some important differences between the two:

Cypher is schema-flexible
While it is both possible and advised to enforce partial schemas using indexes and constraints, Cypher
and Neo4j offer a greater degree of schema flexibility than SQL and relational databases. More
specifically, nodes and relationships in a Neo4j database do not have to have a specific property set on
them just because other nodes or relationships in the same graph have that property (unless a property
existence constraint has been created on that specific property). This means that users are not required
to use a fixed schema to represent data and that they can add new attributes and relationships as their
graphs evolve.
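
As a minimal sketch of this flexibility (the nickname property below is purely illustrative and not part of any
predefined schema), a property can be set on a single matched node without affecting any other node in the
graph:

MATCH (n:Person {name: 'Anna'})
SET n.nickname = 'Ann' // only the matched node gains this property; other Person nodes are unaffected
RETURN n.name, n.nickname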

Query order
SQL queries begin with what a user wants to return, whereas Cypher queries end with the return
clause. For example, consider the following two queries (both searching a database for titles of movies
with a rating of greater than 7), the first written with SQL and the second with Cypher:

SELECT movie.title
FROM movie
WHERE movie.rating > 7

MATCH (movie:Movie)
WHERE movie.rating > 7
RETURN movie.title

Cypher queries are more concise


Due to its intuitive, whiteboard-like method of constructing clauses, Cypher queries are often more
concise than their equivalent SQL queries. For example, consider the following two queries (both
searching a database for the names of the actors in the movie The Matrix), the first written with SQL
and the second with Cypher:

SELECT actors.name
FROM actors
LEFT JOIN acted_in ON acted_in.actor_id = actors.id
LEFT JOIN movies ON movies.id = acted_in.movie_id
WHERE movies.title = 'The Matrix'

MATCH (actor:Actor)-[:ACTED_IN]->(movie:Movie {title: 'The Matrix'})


RETURN actor.name

Cypher and APOC


Neo4j supports the APOC (Awesome Procedures on Cypher) Core library. The APOC Core library provides
access to user-defined procedures and functions which extend the use of the Cypher query language into
areas such as data integration, graph algorithms, and data conversion.
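
For example, assuming the APOC Core library is installed, an APOC function can be called directly inside a
Cypher expression. A minimal sketch using apoc.text.join:

RETURN apoc.text.join(['Cypher', 'and', 'APOC'], ' ') AS sentence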

For more details, visit the APOC Core page.

Cypher and Neo4j


This section discusses aspects of Neo4j that are important to consider when using Cypher.

Cypher and the different editions of Neo4j
Neo4j consists of two editions: a commercial Enterprise Edition, and a Community Edition.

Cypher works almost identically between the two editions, but there are key areas in which they differ:

Feature: Multi-database
• Enterprise Edition: Any number of user databases.
• Community Edition: Only system and one user database.

Feature: Role-based security
• Enterprise Edition: User, role, and privilege management for flexible sub-graph access control.
• Community Edition: Multi-user management. All users have full access rights.

Feature: Constraints
• Enterprise Edition: All constraints: node and relationship property existence constraints, node and
relationship property type constraints, node and relationship property uniqueness constraints, node and
relationship key constraints.
• Community Edition: Only node and relationship property uniqueness constraints.

Key Neo4j terminology


Cypher queries are executed against a Neo4j database, but normally apply to specific graphs. It is
important to understand the meaning of these terms and exactly when a graph is not a database.

DBMS
A Neo4j Database Management System is capable of containing and managing multiple graphs
contained in databases. Client applications will connect to the DBMS and open sessions against it. A
client session provides access to any graph in the DBMS.

Graph
Refers to a data model within a database. Normally there is only one graph within each database, and
many administrative commands that refer to a specific graph do so using the database name. Cypher
queries executed in a session may declare which graph they apply to, or use a default, given by the
session. Composite databases can contain multiple graphs, by means of aliases to other databases.
Queries submitted to composite databases may refer to multiple graphs within the same query. For
more information, see Operations manual → Composite databases.

Database
A database is a storage and retrieval mechanism for collecting data in a defined space on disk and in
memory.
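
As noted under Graph above, a Cypher query may declare which graph it applies to. A minimal sketch using
the USE clause, assuming the default database name neo4j:

USE neo4j
MATCH (n)
RETURN count(n) AS nodeCount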

Built-in databases in Neo4j
All Neo4j servers contain a built-in database called system, which behaves differently than all other
databases. The system database stores system data and you can not perform graph queries against it.

A fresh installation of Neo4j includes two databases:

• system - the system database described above, containing meta-data on the DBMS and security
configuration.

• neo4j - the default database, named using the config option dbms.default_database=neo4j.

For more information about the system database, see the sections on Access control.

Query considerations
Most of the time Cypher queries are reading or updating queries, which are run against a graph. There are
also administrative commands that apply to a database, or to the entire DBMS. Administrative commands
cannot be run in a session connected to a normal user database, but instead need to be run within a
session connected to the system database. Administrative commands execute on the system database. If
an administrative command is submitted to a user database, it is rerouted to the system database.
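
For example, an administrative command such as SHOW DATABASES always executes on the system
database, regardless of which database the session is connected to. A minimal sketch:

SHOW DATABASES YIELD name, currentStatus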

Cypher and Neo4j transactions


All Cypher queries run within transactions. Modifications done by updating queries are held in memory by
the transaction until it is committed, at which point the changes are persisted to disk and become visible to
other transactions. If an error occurs - either during query evaluation, such as division by zero, or during
commit, such as constraint violations - the transaction is automatically rolled back, and no changes are
persisted in the graph.

In short, an updating query always either fully succeeds or does not succeed at all.

A query that makes a large number of updates consequently uses large amounts of memory, since the
transaction holds changes in memory. For memory configuration in Neo4j, see the Neo4j Operations
Manual → Memory configuration.

Explicit and implicit transactions


Transactions in Neo4j can be either explicit or implicit.

Explicit transactions:
• Opened by the user.
• Can execute multiple Cypher queries in sequence.
• Committed, or rolled back, by the user.

Implicit transactions:
• Opened automatically.
• Can execute a single Cypher query.
• Committed automatically when a transaction finishes successfully.

Queries that start separate transactions themselves, such as queries using CALL { ... } IN TRANSACTIONS,
are only allowed in implicit mode. Explicit transactions cannot be managed directly from queries; they must
be managed via APIs or tools.
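
A minimal sketch of such a query, which must therefore be run in an implicit transaction (the CSV file name
and the Person label are assumptions made for this example):

LOAD CSV FROM 'file:///people.csv' AS row
CALL {
  WITH row
  CREATE (:Person {name: row[0]})
} IN TRANSACTIONS OF 500 ROWS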

For examples of the API, or the commands used to start and commit transactions, refer to the API or tool-
specific documentation:

• For information on using transactions with a Neo4j driver, see The session API in the Neo4j Driver
manuals.

• For information on using transactions over the HTTP API, see the HTTP API documentation → Using
the HTTP API.

• For information on using transactions within the embedded Core API, see the Java Reference →
Executing Cypher queries from Java.

• For information on using transactions within the Neo4j Browser or Cypher-shell, see the Cypher-shell
documentation.

When writing procedures or using Neo4j embedded, remember that all iterators returned from an
execution result should be either fully exhausted or closed. This ensures that the resources bound to them
are properly released.

DBMS transactions
Beginning a transaction while connected to a DBMS will start a DBMS-level transaction. A DBMS-level
transaction is a container for database transactions.

A database transaction is started when the first query to a specific database is issued. Database
transactions opened inside a DBMS-level transaction are committed or rolled back when the DBMS-level
transaction is committed or rolled back.

DBMS transactions have the following limitations:

• Only one database can be written to in a DBMS transaction.

• Cypher operations fall into the following main categories:


◦ Operations on graphs.

◦ Schema commands.

◦ Administration commands.

It is not possible to combine any of these workloads in a single DBMS transaction.

ACID compliance
Neo4j is fully ACID compliant. This means that:

• Atomicity - If a part of a transaction fails, the database state is left unchanged.

• Consistency - Every transaction leaves the database in a consistent state.

• Isolation - During a transaction, modified data cannot be accessed by other operations.

• Durability - The DBMS can always recover the results of a committed transaction.

Cypher and Aura
This page provides a brief overview of Neo4j Aura and its relationship to Cypher.

What is Aura?
Aura is Neo4j’s fully managed cloud service. It consists of AuraDB and AuraDS. AuraDB is a graph
database service for developers building intelligent applications, and AuraDS is a Graph Data Science
(GDS) service for data scientists building predictive models and analytics workflows.

AuraDB is available on the following tiers:

• AuraDB Free

• AuraDB Professional

• AuraDB Business Critical

• AuraDB Virtual Dedicated Cloud

For more information, see Aura docs - Neo4j AuraDB overview.

AuraDS is available on the following tiers:

• Graph Data Science Community

• Graph Data Science Enterprise

• AuraDS Professional

• AuraDS Enterprise

For more information, see Aura docs - Neo4j AuraDS overview.

Using Cypher on Aura


Most Cypher features are available across all tiers of Aura. However, certain features are not supported in
Aura instances. For example, it is not possible to create, alter, or drop databases using Aura, nor is it
possible to alter or drop servers.

Additionally, some Cypher features are exclusive to AuraDB Business Critical and AuraDB Virtual
Dedicated Cloud tiers. These primarily fall under database administration and role-based access control
capabilities. For more information, see the Operations Manual → Authentication and authorization.

Aura and the Cypher Cheat Sheet


Each different tier of Aura has a customized version of the Cypher Cheat Sheet which only shows the
features of Cypher available for the chosen tier.

The Cypher Cheat Sheet can be accessed here. You can select your desired Aura tier and Neo4j version by
using the dropdown menus provided. Note that the default tier is AuraDB Virtual Dedicated Cloud.

Queries
This section provides a brief overview of the core concepts of a Cypher query (nodes, relationships, and
paths), and examples of how to query a Neo4j graph database. It also contains information about Cypher
expressions.

• Core concepts

• Basic queries

• Cypher expressions

• Conditional expressions (CASE)

Core concepts

Fundamentally, a Neo4j graph database consists of three core entities: nodes, relationships, and paths.
Cypher queries are constructed to either match or create these entities in a graph. Having a basic
understanding of what nodes, relationships, and paths are in a graph database is therefore crucial in order
to construct Cypher queries.

The below examples use the MATCH and RETURN clauses to find and return the desired graph patterns.
To learn more about these and many other Cypher clauses, see the section on Clauses.

Nodes
The data entities in a Neo4j graph database are called nodes. Nodes are referred to in Cypher using
parentheses ().

MATCH (n:Person {name:'Anna'})
RETURN n.born AS birthYear

In the above example, the node includes the following:

• A Person label. Labels are like tags, and are used to query the database for specific nodes. A node may
have multiple labels, for example Person and Actor.

• A name property set to Anna. Properties are defined within curly braces, {}, and are used to provide
nodes with specific information, which can also be queried for and further improve the ability to
pinpoint data.

• A variable, n. Variables allow referencing specified nodes in subsequent clauses.

In this example, the first MATCH clause finds all Person nodes in the graph with the name property set to
Anna, and binds them to the variable n. The variable n is then passed along to the subsequent RETURN
clause, which returns the value of a different property (born) belonging to the same node.

Relationships
Nodes in a graph can be connected with relationships. A relationship must have a start node, an end node,
and exactly one type. Relationships are represented in Cypher with arrows (e.g. -->) indicating the
direction of a relationship.

MATCH (:Person {name: 'Anna'})-[r:KNOWS WHERE r.since < 2020]->(friend:Person)
RETURN count(r) AS numberOfFriends

Unlike nodes, information within a relationship pattern must be enclosed by square brackets. The query
example above matches for relationships of type KNOWS and with the property since set to less than 2020.
The query also requires the relationships to go from a Person node named Anna to any other Person nodes,
referred to as friend. The count() function is used in the RETURN clause to count all the relationships bound
by the r variable in the preceding MATCH clause (i.e. how many friends Anna has known since before 2020).

Note that while nodes can have several labels, relationships can only have one type.
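
As a minimal sketch of this difference (the labels, type, and property values below are illustrative), a node
can be created with two labels, while its relationship is created with exactly one type:

CREATE (:Person:Actor {name: 'Keanu Reeves'})-[:ACTED_IN {roles: ['Neo']}]->(:Movie {title: 'The Matrix'})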

Paths
Paths in a graph consist of connected nodes and relationships. Exploring these paths sits at the very core
of Cypher.

MATCH (n:Person {name: 'Anna'})-[:KNOWS]-{1,5}(friend:Person WHERE n.born < friend.born)
RETURN DISTINCT friend.name AS olderConnections

This example uses a quantified relationship to find all paths up to 5 hops away, traversing only
relationships of type KNOWS from the start node Anna to other older Person nodes (as defined by the
WHERE clause). The DISTINCT operator is used to ensure that the RETURN clause only returns unique
nodes.

Paths can also be assigned variables. For example, the below query binds a whole path pattern, which
matches the SHORTEST path from Anna to another Person node in the graph with a nationality property set
to Canadian. In this case, the RETURN clause returns the full path between the two nodes.

MATCH p = SHORTEST 1 (:Person {name: 'Anna'})-[:KNOWS]-+(:Person {nationality: 'Canadian'})
RETURN p

For more information about graph pattern matching, see Patterns.

Basic queries

This page contains information about how to create, query, and delete a graph database using Cypher. For
more advanced queries, see the section on Subqueries.

The examples below use the publicly available Neo4j movie database.

Creating a data model
Before creating a property graph database, it is important to develop an appropriate data model. This will
provide structure to the data, and allow users of the graph to efficiently retrieve the information they are
looking for.

The following data model is used for the Neo4j movie database:

It includes two types of node labels:

• Person nodes, which have the following properties: name and born.

• Movie nodes, which have the following properties: title, released, and tagline.

The data model also contains five different relationship types between the Person and Movie nodes:
ACTED_IN, DIRECTED, PRODUCED, WROTE, and REVIEWED. Two of the relationship types have properties:

• The ACTED_IN relationship type, which has the roles property.

• The REVIEWED relationship type, which has a summary property and a rating property.

To learn more about data modelling for graph databases, enroll in the free Graph Data Modelling
Fundamentals course offered by GraphAcademy.

Creating a property graph database


The complete Cypher query to create the Neo4j movie database can be found here. To create the full
graph, run the full query against an empty Neo4j database.
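
As a minimal sketch of the kind of statements the full query contains (this fragment is illustrative rather
than the complete script, and the released and tagline values are assumptions for the example):

CREATE (keanu:Person {name: 'Keanu Reeves', born: 1964})
CREATE (theMatrix:Movie {title: 'The Matrix', released: 1999, tagline: 'Welcome to the Real World'})
CREATE (keanu)-[:ACTED_IN {roles: ['Neo']}]->(theMatrix)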

Finding nodes
The MATCH clause is used to find a specific pattern in the graph, such as a specific node. The RETURN clause
specifies what of the found graph pattern to return.

For example, this query will find the nodes with the Person label and the name Keanu Reeves, and return the
name and born properties of the found nodes:

Query

MATCH (keanu:Person {name:'Keanu Reeves'})
RETURN keanu.name AS name, keanu.born AS born

Table 1. Result

name born

"Keanu Reeves" 1964

Rows: 1

It is also possible to query a graph for several nodes. This query matches all nodes with the Person label,
and limits the results to only include five rows.

Query

MATCH (people:Person)
RETURN people
LIMIT 5

Table 2. Result

people

{"born":1964,"name":"Keanu Reeves"}

{"born":1967,"name":"Carrie-Anne Moss"}

{"born":1961,"name":"Laurence Fishburne"}

{"born":1960,"name":"Hugo Weaving"}

{"born":1967,"name":"Lilly Wachowski"}

Rows: 5

Note on clause composition


Similar to SQL, Cypher queries are constructed using various clauses which are chained together to feed
intermediate results between each other. Each clause has as input the state of the graph and a table of
intermediate results consisting of the referenced variables. The first clause takes as input the state of the
graph before the query and an empty table of intermediate results. The output of a clause is a new state of
the graph and a new table of intermediate results, serving as input to the next clause. The output of the
last clause is the result of the query.

Note that if one of the clauses returns an empty table of intermediate results, there is nothing to pass on to
subsequent clauses, thus ending the query. (There are ways to circumvent this behaviour. For example, by
replacing a MATCH clause with OPTIONAL MATCH.)

In the below example, the first MATCH clause finds all nodes with the Person label. The second clause will
then filter those nodes to find all Person nodes who were born in the 1980s. The final clause returns the
result in a descending chronological order.

Query

MATCH (bornInEighties:Person)
WHERE bornInEighties.born >= 1980 AND bornInEighties.born < 1990
RETURN bornInEighties.name AS name, bornInEighties.born AS born
ORDER BY born DESC

Table 3. Result

name born

"Emile Hirsch" 1985

"Rain" 1982

"Natalie Portman" 1981

"Christina Ricci" 1980

Rows: 4

For more details, see the section on Clause composition.

Finding connected nodes


To discover how nodes are connected to one another, relationships must be added to queries. Queries can
specify relationship types, properties, and direction, as well as the start and end nodes of the pattern.

For example, the following query matches the graph for the directors of the movie The Matrix, and returns
the name property of those directors.

Query

MATCH (m:Movie {title: 'The Matrix'})<-[d:DIRECTED]-(p:Person)
RETURN p.name AS director

Table 4. Result

director

"Lilly Wachowski"

"Lana Wachowski"

Rows: 2

It is also possible to look for the type of relationships that connect nodes to one another. The below query
searches the graph for outgoing relationships from the Tom Hanks node to any Movie nodes, and returns
the relationships and the titles of the movies connected to him.

Query

MATCH (tom:Person {name:'Tom Hanks'})-[r]->(m:Movie)
RETURN type(r) AS type, m.title AS movie

The result shows that he has 13 outgoing relationships connected to 12 different Movie nodes (12 have
the ACTED_IN type and one has the DIRECTED type).

[Graph visualization: the Tom Hanks node and its 13 outgoing relationships (12 ACTED_IN, 1 DIRECTED) to
the 12 connected Movie nodes.]

Table 5. Result

type movie

"ACTED_IN" "Apollo 13"

"ACTED_IN" "You’ve Got Mail"

"ACTED_IN" "A League of Their Own"

"ACTED_IN" "That Thing You Do"

"ACTED_IN" "The Da Vinci Code"

"ACTED_IN" "Cloud Atlas"

"ACTED_IN" "Joe versus the Volcano"

"ACTED_IN" "Cast Away"

"ACTED_IN" "The Green Mile"

"ACTED_IN" "Sleepless in Seattle"

"ACTED_IN" "The Polar Express"

"ACTED_IN" "Charlie Wilson’s War"

"DIRECTED" "That Thing You Do"

Rows: 13

It is possible to further modify Cypher queries by adding label expressions to the clauses. For example, the
below query uses a NOT label expression (!) to return all relationships connected to Tom Hanks that are not
of type ACTED_IN.

Query

MATCH (:Person {name:'Tom Hanks'})-[r:!ACTED_IN]->(m:Movie)
RETURN type(r) AS type, m.title AS movies

Table 6. Result

type movies

"DIRECTED" "That Thing You Do"

Rows: 1

For more information about the different label expressions supported by Cypher, see the section on label
expressions.

Finding paths
There are several ways in which Cypher can be used to search a graph for paths between nodes.

To search for patterns of a fixed length, specify the distance (hops) between the nodes in the pattern by
using a quantifier ({n}). For example, the following query matches all Person nodes exactly 2 hops away
from Tom Hanks and returns the first five rows. The DISTINCT operator ensures that the result contains no
duplicate values.

Query

MATCH (tom:Person {name:'Tom Hanks'})--{2}(colleagues:Person)
RETURN DISTINCT colleagues.name AS name, colleagues.born AS bornIn
ORDER BY bornIn
LIMIT 5

Table 7. Result

name bornIn

"Mike Nichols" 1931

"Ian McKellen" 1939

"James Cromwell" 1940

"Nora Ephron" 1941

"Penny Marshall" 1943

Rows: 5

It is also possible to match a graph for patterns of a variable length. The below query matches all Person
nodes between 1 and 4 hops away from Tom Hanks and returns the first five rows.

Query

MATCH (p:Person {name:'Tom Hanks'})--{1,4}(colleagues:Person)
RETURN DISTINCT colleagues.name AS name, colleagues.born AS bornIn
ORDER BY bornIn, name
LIMIT 5

Table 8. Result

name bornIn

"Max von Sydow" 1929

"Clint Eastwood" 1930

"Gene Hackman" 1930

"Richard Harris" 1930

"Mike Nichols" 1931

Rows: 5

The quantifier used in the above two examples was introduced with the release of quantified path
patterns in Neo4j 5.9. Before that, the only way in Cypher to match paths of a variable length was with a
variable-length relationship. This syntax is still available in Cypher, but it is not GQL conformant. For more
information, see Patterns → Syntax and semantics → Variable length relationships.

The SHORTEST keyword can be used to find a variation of the shortest paths between two nodes. In this
example, ALL SHORTEST paths between the two nodes Keanu Reeves and Tom Cruise are found. The
count() function calculates the number of these shortest paths while the length() function calculates the
length of each path in terms of traversed relationships.

Query

MATCH p = ALL SHORTEST (:Person {name:"Keanu Reeves"})--+(:Person {name:"Tom Cruise"})
RETURN count(p) AS pathCount, length(p) AS pathLength

The results show that 2 different paths are tied for the shortest length.

Table 9. Result

pathCount pathLength

2 4

Rows: 1

The SHORTEST keyword was introduced in Neo4j 5.21, and functionally replaces and extends the
shortestPath() and allShortestPaths() functions. Both functions can still be used, but they are not GQL
conformant. For more information, see Patterns → Syntax and semantics → The shortestPath() and
allShortestPaths() functions.

For more information about graph pattern matching, see Patterns.

Finding recommendations
Cypher allows for more complex queries. The following query recommends potential new co-actors for Keanu
Reeves: actors he has not yet worked with, but with whom his co-actors have worked. The query then orders
the results by how frequently a matched co-co-actor has collaborated with one of Keanu Reeves' co-actors.

Query

MATCH (keanu:Person {name:'Keanu Reeves'})-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(coActors:Person),


(coActors:Person)-[:ACTED_IN]->(m2:Movie)<-[:ACTED_IN]-(cocoActors:Person)
WHERE NOT (keanu)-[:ACTED_IN]->()<-[:ACTED_IN]-(cocoActors) AND keanu <> cocoActors
RETURN cocoActors.name AS recommended, count(cocoActors) AS strength
ORDER BY strength DESC
LIMIT 7

Table 10. Result

recommended strength

"Tom Hanks" 4

"John Hurt" 3

"Jim Broadbent" 3

"Halle Berry" 3

"Stephen Rea" 3

"Natalie Portman" 3

"Ben Miles" 3

Rows: 7

There are several connections between the Keanu Reeves and Tom Hanks nodes in the movie database, but
the two have never worked together in a film. The following query matches co-actors who could introduce
the two, by looking for actors who have worked with both of them in separate movies:

Query

MATCH (:Person {name: 'Keanu Reeves'})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person),


(coActor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(:Person {name:'Tom Hanks'})
RETURN DISTINCT coActor.name AS coActor

Table 11. Result

coActor

"Charlize Theron"

"Hugo Weaving"

Rows: 2

Delete a graph
To delete all nodes and relationships in a graph, run the following query:

MATCH (n)
DETACH DELETE n

For more information, see the section on the DELETE clause.

Cypher expressions
This page contains examples of allowed expressions in Cypher.

General
• A variable: n, x, rel, myFancyVariable, `A name with special characters in it[]!`.

• A property: n.prop, x.prop, rel.thisProperty, myFancyVariable.`(special property name)`.

• A dynamic property: n["prop"], rel[n.city + n.zip], map[coll[0]].

• A parameter: $param, $0.

• A list of expressions: ['a', 'b'], [1, 2, 3], ['a', 2, n.property, $param], [].

• A function call: length(p), nodes(p).

• An aggregate function call: avg(x.prop), count(*).

• A path-pattern: (a)-[r]->(b), (a)-[r]-(b), (a)--(b), (a)-->()<--(b).

• An operator application: 1 + 2, 3 < 4.

• A subquery expression: COUNT {}, COLLECT {}, EXISTS {}, CALL {}.

• A regular expression: a.name =~ 'Tim.*'.

• A CASE expression.

• null.

Expressions containing unsanitized user input may make your application vulnerable to

 Cypher injection. Consider using parameters instead. Learn more in Protecting against
Cypher Injection.

Most expressions in Cypher evaluate to null if any of their inner expressions are null.

 Notable exceptions are the operators IS NULL, IS NOT NULL, and the type predicate
expressions.
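
For example (a minimal sketch), most operators propagate null, whereas IS NULL evaluates to a BOOLEAN value:

Query

RETURN 1 + null AS sum, null IS NULL AS isNull

The addition returns null, while the predicate returns true.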

Numerical
• A numeric (INTEGER or FLOAT) literal: 13, -40000, 3.14.

• A numeric (INTEGER or FLOAT) literal in scientific notation: 6.022E23.

• A hexadecimal INTEGER literal (starting with 0x): 0x13af, 0xFC3A9, -0x66eff.

• An octal INTEGER literal (starting with 0o): 0o1372, -0o5671.

• A FLOAT literal: Inf, Infinity, NaN.

• null.

Any numeric literal may contain an underscore _ between digits. There may be an
 underscore between the 0x or 0o and the digits for hexadecimal and octal literals.
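
A minimal sketch of these literal forms (assuming a Neo4j version that accepts underscores in numeric literals):

Query

RETURN 1_000_000 AS million, 6.022E23 AS avogadro, 0x13af AS hex, 0o1372 AS octal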

String
• A STRING literal: 'Hello', "World".

• A case-sensitive STRING matching expression: a.surname STARTS WITH 'Sven', a.surname ENDS WITH
'son' or a.surname CONTAINS 'son'.

• null.

String literal escape sequences


String literals can contain the following escape sequences:

Escape sequence Character

\t Tab

\b Backspace

\n Newline

\r Carriage return

\f Form feed

\' Single quote

\" Double quote

\\ Backslash

\uxxxx Unicode UTF-16 code point (4 hex digits must follow the \u)
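
A minimal sketch using some of these escape sequences:

Query

RETURN 'Line one\nLine two' AS text, 'She said \'hi\'' AS quoted, '\u0041' AS letterA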

Boolean
• A BOOLEAN literal: true, false.

• A predicate expression (i.e. an expression returning a BOOLEAN value): a.prop = 'Hello', length(p) >
10, a.name IS NOT NULL.

• Label and relationship type expressions: (n:A|B), ()-[r:R1|R2]->().

• null.

Conditional expressions (CASE)


Generic conditional expressions can be expressed in Cypher using the CASE construct. Two variants of CASE
exist within Cypher: the simple form, to compare a single expression against multiple values, and the
generic form, to express multiple conditional statements.

CASE can only be used as part of RETURN or WITH if you want to use the result in a
 subsequent clause.

Example graph
The following graph is used for the examples below:

Five Person nodes: Alice (age: 38, eyes: 'brown'), Bob (age: 25, eyes: 'blue'), Charlie (age: 53, eyes: 'green'),
Daniel (eyes: 'brown'), and Eskil (age: 41, eyes: 'blue'). Alice KNOWS Bob and Charlie, Bob and Charlie each
KNOWS Daniel, and Bob is MARRIED to Eskil.

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(alice:Person {name:'Alice', age: 38, eyes: 'brown'}),
(bob:Person {name: 'Bob', age: 25, eyes: 'blue'}),
(charlie:Person {name: 'Charlie', age: 53, eyes: 'green'}),
(daniel:Person {name: 'Daniel', eyes: 'brown'}),
(eskil:Person {name: 'Eskil', age: 41, eyes: 'blue'}),
(alice)-[:KNOWS]->(bob),
(alice)-[:KNOWS]->(charlie),
(bob)-[:KNOWS]->(daniel),
(charlie)-[:KNOWS]->(daniel),
(bob)-[:MARRIED]->(eskil)

Simple CASE
The simple CASE form is used to compare a single expression against multiple values, and is analogous to
the switch construct of programming languages. The expressions are evaluated by the WHEN operator until
a match is found. If no match is found, the expression in the ELSE operator is returned. If there is no ELSE
case and no match is found, null will be returned.

Syntax

CASE test
WHEN value [, value]* THEN result
[WHEN ...]
[ELSE default]
END

Arguments:

Name Description

test An expression.

value An expression whose result will be compared to test.

result The expression returned as output if value matches test

default The expression to return if no value matches the test expression.

Example

MATCH (n:Person)
RETURN
CASE n.eyes
WHEN 'blue' THEN 1
WHEN 'brown', 'hazel' THEN 2
ELSE 3
END AS result, n.eyes

result n.eyes

2 "brown"

1 "blue"

3 "green"

2 "brown"

1 "blue"

Rows: 5

Extended Simple CASE (introduced in Neo4j 5.18)


The extended simple CASE form allows the comparison operator to be specified explicitly. The simple CASE
uses an implied equals (=) comparator.

The supported comparators are:

• Regular Comparison Operators: =, <>, <, >, <=, >=

• IS NULL Operator: IS [NOT] NULL

• Type Predicate Expression: IS [NOT] TYPED <TYPE> (Note that the form IS [NOT] :: <TYPE> is not accepted)

• Normalization Predicate Expression: IS [NOT] NORMALIZED

• String Comparison Operators: STARTS WITH, ENDS WITH, =~ (regex matching)

Syntax

CASE test
WHEN [comparisonOperator] value [, [comparisonOperator] value ]* THEN result
[WHEN ...]
[ELSE default]
END

Arguments:

Name Description

test An expression.

comparisonOperator One of the supported comparison operators.

value An expression whose result is compared to test using the given comparison
operator.

result The expression returned as output if value matches test.

default The expression to return if no value matches the test expression.

Example

MATCH (n:Person)
RETURN n.name,
CASE n.age
WHEN IS NULL, IS NOT TYPED INTEGER | FLOAT THEN "Unknown"
WHEN = 0, = 1, = 2 THEN "Baby"
WHEN <= 13 THEN "Child"
WHEN < 20 THEN "Teenager"
WHEN < 30 THEN "Young Adult"
WHEN > 1000 THEN "Immortal"
ELSE "Adult"
END AS result

n.name result

"Alice" "Adult"

"Bob" "Young Adult"

"Charlie" "Adult"

"Daniel" "Unknown"

"Eskil" "Adult"

Rows: 5

Generic CASE
The generic CASE expression supports multiple conditional statements, and is analogous to the if-elseif-else
construct of programming languages. Each WHEN predicate is evaluated in order until a true value is found. If no
match is found, the expression in the ELSE operator is returned. If there is no ELSE case and no match is
found, null will be returned.

Syntax

CASE
WHEN predicate THEN result
[WHEN ...]
[ELSE default]
END

Arguments:

Name Description

predicate A predicate is an expression that evaluates to a BOOLEAN value. In this case, the
predicate is tested to find a valid alternative.

result The expression returned as output if predicate evaluates to true.

default If no match is found, default is returned.

Example

MATCH (n:Person)
RETURN
CASE
WHEN n.eyes = 'blue' THEN 1
WHEN n.age < 40 THEN 2
ELSE 3
END AS result, n.eyes, n.age

result n.eyes n.age

2 "brown" 38

1 "blue" 25

3 "green" 53

3 "brown" null

1 "blue" 41

Rows: 5

CASE with null values


When working with null values, you may be forced to use the generic CASE form. The two examples below
use the age property of the Daniel node (which has a null value for that property) to clarify the difference.

Simple CASE

MATCH (n:Person)
RETURN n.name,
CASE n.age ①
WHEN null THEN -1 ②
ELSE n.age - 10 ③
END AS age_10_years_ago

① n.age is the expression being evaluated. Note that the node Daniel has a null value as age.

② This branch is skipped, because null does not equal any other value, including null itself.

③ The execution takes the ELSE branch, which outputs null because n.age - 10 equals null.

n.name age_10_years_ago

"Alice" 28

"Bob" 15

"Charlie" 43

"Daniel" null

"Eskil" 31

Rows: 5

Generic CASE

MATCH (n:Person)
RETURN n.name,
CASE ①
WHEN n.age IS NULL THEN -1 ②
ELSE n.age - 10
END AS age_10_years_ago

① If no expression is provided after CASE, it acts in its generic form, supporting predicate expressions in
each branch.

② This predicate expression evaluates to true for the node Daniel, so the result from this branch is
returned.

n.name age_10_years_ago

"Alice" 28

"Bob" 15

"Charlie" 43

"Daniel" -1

"Eskil" 31

Rows: 5

For more information about null, see Working with null.

CASE expressions and succeeding clauses


The results of a CASE expression can be used to set properties on a node or relationship.

MATCH (n:Person)
WITH n,
CASE n.eyes
WHEN 'blue' THEN 1
WHEN 'brown' THEN 2
ELSE 3
END AS colorCode
SET n.colorCode = colorCode
RETURN n.name, n.colorCode

n.name n.colorCode

"Alice" 2

"Bob" 1

"Charlie" 3

"Daniel" 2

"Eskil" 1

Rows: 5

For more information about using the SET clause, see SET.

Further considerations
CASE result branches are statically checked prior to execution. This means that if a branch is not
semantically correct, it will still throw an exception, even if that branch may never be executed during
runtime.

In the following example, date is statically known to be a STRING value, and therefore would fail if treated
as a DATE value.

Not allowed

WITH "2024-08-05" AS date, "string" AS type


RETURN CASE type
WHEN "string" THEN datetime(date)
WHEN "date" THEN datetime({year: date.year, month: date.month, day: date.day})
ELSE datetime(date)
END AS dateTime

Error message

Type mismatch: expected Map, Node, Relationship, Point, Duration, Date, Time, LocalTime, LocalDateTime or
DateTime but was String (line 4, column 38 (offset: 136))
" WHEN 'date' THEN datetime({year: date.year, month: date.month, day: date.day})"
^

Clauses
This section contains information on all the clauses in the Cypher query language.

Reading clauses
These comprise clauses that read data from the database.

The flow of data within a Cypher query is an unordered sequence of maps with key-value pairs — a set of
possible bindings between the variables in the query and values derived from the database. This set is
refined and augmented by subsequent parts of the query.

Clause Description

MATCH Specify the patterns to search for in the database.

OPTIONAL MATCH Specify the patterns to search for in the database while using
nulls for missing parts of the pattern.

Projecting clauses
These comprise clauses that define which expressions to return in the result set. The returned expressions
may all be aliased using AS.

Clause Description

RETURN … [AS] Defines what to include in the query result set.

WITH … [AS] Allows query parts to be chained together, piping the results
from one to be used as starting points or criteria in the next.

UNWIND … [AS] Expands a list into a sequence of rows.

FINISH Defines a query to have no result.

Reading sub-clauses
These comprise sub-clauses that must operate as part of reading clauses.

Sub-clause Description

WHERE Adds constraints to the patterns in a MATCH or OPTIONAL


MATCH clause or filters the results of a WITH clause.

ORDER BY [ASC[ENDING] | DESC[ENDING]] A sub-clause following RETURN or WITH, specifying that the
output should be sorted in either ascending (the default) or
descending order. As of Neo4j 5.24, it can also be used as a
standalone clause.

SKIP / OFFSET Defines from which row to start including the rows in the
output. As of Neo4j 5.24, it can be used as a standalone
clause.

LIMIT Constrains the number of rows in the output. As of Neo4j 5.24, it can be used as a standalone clause.

Writing clauses
These comprise clauses that write the data to the database.

Clause Description

CREATE Create nodes and relationships.

DELETE Delete nodes, relationships or paths. Any node to


be deleted must also have all associated
relationships explicitly deleted.

DETACH DELETE Delete a node or set of nodes. All associated


relationships will automatically be deleted.

SET Update labels on nodes and properties on nodes and


relationships.

REMOVE Remove properties and labels from nodes and relationships.

FOREACH Update data within a list, whether components of a path, or


the result of aggregation.

Reading/Writing clauses
These comprise clauses that both read data from and write data to the database.

Clause Description

MERGE Ensures that a pattern exists in the graph. Either the pattern
already exists, or it needs to be created.

ON CREATE Used in conjunction with MERGE, this write sub-clause specifies the actions to take if the pattern needs to be created.

ON MATCH Used in conjunction with MERGE, this write sub-clause specifies the actions to take if the pattern already exists.

CALL … [YIELD … ] Invokes a procedure deployed in the database and return any
results.

Subquery clauses

Clause Description

CALL { … } Evaluates a subquery, typically used for post-union


processing or aggregations.

CALL { … } IN TRANSACTIONS Evaluates a subquery in separate transactions.


Typically used when modifying or importing large
amounts of data.

Set operations
Clause Description

UNION Combines the result of multiple queries into a single


result set. Duplicates are removed.

UNION ALL Combines the result of multiple queries into a single


result set. Duplicates are retained.

Multiple graphs
Clause Description

USE Determines which graph a query, or query part, is executed against (Fabric).

Importing data
Clause Description

LOAD CSV Use when importing data from CSV files.

CALL { … } IN TRANSACTIONS This clause may be used to prevent an out-of-memory error


from occurring when importing large amounts of data using
LOAD CSV.

Listing functions and procedures


Clause Description

SHOW FUNCTIONS List the available functions.

SHOW PROCEDURES List the available procedures.

Configuration Commands

Clause Description

SHOW SETTINGS List configuration settings.

Transaction Commands
Clause Description

SHOW TRANSACTIONS List the available transactions.

TERMINATE TRANSACTIONS Terminate transactions by their IDs.

Reading hints
These comprise clauses used to specify planner hints when tuning a query. More details regarding the
usage of these — and query tuning in general — can be found in Planner hints and the USING keyword.

Hint Description

USING INDEX Index hints are used to specify which index, if any, the
planner should use as a starting point.

USING INDEX SEEK Index seek hint instructs the planner to use an index seek for
this clause.

USING SCAN Scan hints are used to force the planner to do a label scan
(followed by a filtering operation) instead of using an index.

USING JOIN Join hints are used to enforce a join operation at specified
points.
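
As a hedged sketch (assuming an index exists on Person(name)), an index hint is placed directly after the MATCH clause it applies to:

Query

MATCH (p:Person {name: 'Tom Hanks'})
USING INDEX p:Person(name)
RETURN p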

Index and constraint clauses


These comprise clauses to create, show, and drop indexes and constraints.

Clause Description

CREATE | SHOW | DROP INDEX Create, show or drop an index.

CREATE | SHOW | DROP CONSTRAINT Create, show or drop a constraint.
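
For example (a minimal sketch; the index name is illustrative):

Query

CREATE INDEX person_name_index IF NOT EXISTS FOR (p:Person) ON (p.name)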

Administration clauses
Cypher includes commands to manage databases, aliases, servers, and role-based access control. To learn
more about each of these, see:

• Operations Manual → Database administration

• Operations Manual → Authentication and authorization

• Operations Manual → Clustering

Clause composition
This section describes the semantics of Cypher when composing different read and write clauses.

A query is made up from several clauses chained together. These are discussed in more detail in the
chapter on Clauses.

The semantics of a whole query is defined by the semantics of its clauses. Each clause has as input the
state of the graph and a table of intermediate results consisting of the current variables. The output of a
clause is a new state of the graph and a new table of intermediate results, serving as input to the next
clause. The first clause takes as input the state of the graph before the query and an empty table of
intermediate results. The output of the last clause is the result of the query.

 Unless ORDER BY is used, Neo4j does not guarantee the row order of a query result.

Example 1. Table of intermediate results between read clauses

The following example graph is used throughout this section.

Five Person nodes: John, Joe, Steve, Sara, and Maria. John has FRIEND relationships to Sara and Joe,
Joe has a FRIEND relationship to Steve, and Sara has a FRIEND relationship to Maria.

Now follows the table of intermediate results and the state of the graph after each clause for the
following query:

MATCH (john:Person {name: 'John'})


MATCH (john)-[:FRIEND]->(friend)
RETURN friend.name AS friendName

The query only has read clauses, so the state of the graph remains unchanged and is therefore
omitted below.

Table 12. The table of intermediate results after each clause

Clause Table of intermediate results after the clause

MATCH (john:Person {name: 'John'}) john

(:Person {name: 'John'})


MATCH (john)-[:FRIEND]->(friend) john friend

(:Person {name: 'John'}) (:Person {name: 'Sara'})

(:Person {name: 'John'}) (:Person {name: 'Joe'})

RETURN friend.name AS friendName friendName

'Sara'

'Joe'

The above example only looked at clauses that allow linear composition and omitted write clauses. The
next section will explore these non-linear composition and write clauses.

Read-write queries
In a Cypher query, read and write clauses can take turns. The most important aspect of read-write queries
is that the state of the graph also changes between clauses.

 A clause can never observe writes made by a later clause.

Example 2. Table of intermediate results and state of the graph between read and write clauses

Using the same example graph as above, this example shows the table of intermediate results and
the state of the graph after each clause for the following query:

MATCH (j:Person) WHERE j.name STARTS WITH "J"


CREATE (j)-[:FRIEND]->(jj:Person {name: "Jay-jay"})

The query finds all nodes where the name property starts with "J" and for each such node it creates
another node with the name property set to "Jay-jay".

Table 13. The table of intermediate results and the state of the graph after each clause

Clause: MATCH (j:Person) WHERE j.name STARTS WITH "J"

Table of intermediate results after the clause:

j

(:Person {name: 'John'})

(:Person {name: 'Joe'})

State of the graph after the clause: unchanged, since MATCH only reads from the graph.

Clause: CREATE (j)-[:FRIEND]->(jj:Person {name: "Jay-jay"})

Table of intermediate results after the clause:

j jj

(:Person {name: 'John'}) (:Person {name: 'Jay-jay'})

(:Person {name: 'Joe'}) (:Person {name: 'Jay-jay'})

State of the graph after the clause: two new Person nodes named 'Jay-jay' have been created, one
connected by a FRIEND relationship from John and one by a FRIEND relationship from Joe.

It is important to note that the MATCH clause does not find the Person nodes that are created by the
CREATE clause, even though the name "Jay-jay" starts with "J". This is because the CREATE clause
comes after the MATCH clause and thus the MATCH cannot observe any changes to the graph made by
the CREATE.

Queries with UNION


UNION queries are slightly different because the results of two or more queries are put together, but each
query starts with an empty table of intermediate results.

In a query with a UNION clause, any clause before the UNION cannot observe writes made by a clause after
the UNION. Any clause after UNION can observe all writes made by a clause before the UNION. This means
that the rule that a clause can never observe writes made by a later clause still applies in queries using
UNION.

Example 3. Table of intermediate results and state of the graph in a query with UNION

Using the same example graph as above, this example shows the table of intermediate results and
the state of the graph after each clause for the following query:

CREATE (jj:Person {name: "Jay-jay"})


RETURN count(*) AS count
UNION
MATCH (j:Person) WHERE j.name STARTS WITH "J"
RETURN count(*) AS count

Table 14. The table of intermediate results and the state of the graph after each clause

Clause: CREATE (jj:Person {name: "Jay-jay"})

Table of intermediate results after the clause:

jj

(:Person {name: 'Jay-jay'})

State of the graph after the clause: a new Person node named 'Jay-jay' has been created. It is not
connected to any other node.

Clause: RETURN count(*) AS count

Table of intermediate results after the clause:

count

1

State of the graph after the clause: unchanged.

Clause: MATCH (j:Person) WHERE j.name STARTS WITH "J"

Table of intermediate results after the clause:

j

(:Person {name: 'John'})

(:Person {name: 'Joe'})

(:Person {name: 'Jay-jay'})

State of the graph after the clause: unchanged.

Clause: RETURN count(*) AS count

Table of intermediate results after the clause:

count

3

State of the graph after the clause: unchanged.

It is important to note that the MATCH clause finds the Person node that is created by the CREATE
clause. This is because the CREATE clause comes before the MATCH clause and thus the MATCH can
observe any changes to the graph made by the CREATE.

Queries with CALL {} subqueries


Subqueries inside a CALL {} clause are evaluated for each incoming input row. This means that write
clauses inside a subquery can get executed more than once. The different invocations of the subquery are
executed in turn, in the order of the incoming input rows.

Later invocations of the subquery can observe writes made by earlier invocations of the subquery.

Example 4. Table of intermediate results and state of the graph in a query with CALL {}

Using the same example graph as above, this example shows the table of intermediate results and
the state of the graph after each clause for the following query:

The below query uses a variable scope clause (introduced in Neo4j 5.23) to import

 variables into the CALL subquery. If you are using an older version of Neo4j, use an
importing WITH clause instead.

MATCH (john:Person {name: 'John'})


SET john.friends = []
WITH john
MATCH (john)-[:FRIEND]->(friend)
WITH john, friend
CALL (john, friend) {
WITH john.friends AS friends
SET john.friends = friends + friend.name
}

Table 15. The table of intermediate results and the state of the graph after each clause

Clause: MATCH (john:Person {name: 'John'})

Table of intermediate results after the clause:

john

(:Person {name: 'John'})

State of the graph after the clause: unchanged.

Clause: SET john.friends = []

Table of intermediate results after the clause:

john

(:Person {name: 'John', friends: []})

State of the graph after the clause: the John node now has a friends property set to the empty list [].

Clause: MATCH (john)-[:FRIEND]->(friend)

Table of intermediate results after the clause:

john friend

(:Person {name: 'John', friends: []}) (:Person {name: 'Sara'})

(:Person {name: 'John', friends: []}) (:Person {name: 'Joe'})

State of the graph after the clause: unchanged.

Clause: first invocation of WITH john.friends AS friends

Table of intermediate results after the clause:

john friend friends

(:Person {name: 'John', friends: []}) (:Person {name: 'Sara'}) []

State of the graph after the clause: unchanged.

Clause: first invocation of SET john.friends = friends + friend.name

Table of intermediate results after the clause:

john friend friends

(:Person {name: 'John', friends: ['Sara']}) (:Person {name: 'Sara'}) []

State of the graph after the clause: the friends property of the John node is now ['Sara'].

Clause: second invocation of WITH john.friends AS friends

Table of intermediate results after the clause:

john friend friends

(:Person {name: 'John', friends: ['Sara']}) (:Person {name: 'Joe'}) ['Sara']

State of the graph after the clause: unchanged.

Clause: second invocation of SET john.friends = friends + friend.name

Table of intermediate results after the clause:

john friend friends

(:Person {name: 'John', friends: ['Sara', 'Joe']}) (:Person {name: 'Joe'}) ['Sara']

State of the graph after the clause: the friends property of the John node is now ['Sara', 'Joe'].

It is important to note that, in the subquery, the second invocation of the WITH clause could observe
the writes made by the first invocation of the SET clause.

Notes on the implementation


An easy way to implement the semantics outlined above is to fully execute each clause and materialize the
table of intermediate results before executing the next clause. This approach would consume a lot of
memory for materializing the tables of intermediate results and would generally not perform well.

Instead, Cypher will in general try to interleave the execution of clauses. This is called lazy evaluation. It
only materializes intermediate results when needed. In many read-write queries it is unproblematic to
execute clauses interleaved, but when it is not, Cypher must ensure that the table of intermediate results
gets materialized at the right time(s). This is done by inserting an Eager operator into the execution plan.

CALL procedure
The CALL clause is used to call a procedure deployed in the database.

The CALL clause is also used to evaluate a subquery. For more information about the CALL
 clause in this context, refer to CALL subqueries.

For information about how to list procedures, see SHOW PROCEDURES.

Neo4j comes with a number of built-in procedures. For a list of these, see Operations

 Manual → Procedures. Users can also develop custom procedures and deploy to the
database. See Java Reference → User-defined procedures for details.

Example graph
The following graph is used for the examples below:

Two Developer nodes, Andy (born: 1991) and Beatrice (born: 1985), and two Administrator nodes,
Charlotte (born: 1990) and David (born: 1994, nationality: 'Swedish'). Andy KNOWS Beatrice and David,
and Beatrice KNOWS Charlotte.

To recreate it, run the following query against an empty Neo4j database:

CREATE (andy:Developer {name: 'Andy', born: 1991}),


(beatrice:Developer {name: 'Beatrice', born: 1985}),
(charlotte:Administrator {name: 'Charlotte', born: 1990}),
(david:Administrator {name: 'David', born: 1994, nationality: 'Swedish'}),
(andy)-[:KNOWS]->(beatrice),
(beatrice)-[:KNOWS]->(charlotte),
(andy)-[:KNOWS]->(david)

Examples
Example 5. CALL a procedure without arguments

This example calls the built-in procedure db.labels(), which lists all labels used in the database.

Query

CALL db.labels()

Table 16. Result

label

"Developer"

"Administrator"

Rows: 2

It is best practice to use parentheses when calling procedures, although Cypher allows
for their omission when calling procedures of arity-0 (no arguments). Omission of
 parentheses is available only in a so-called standalone procedure call, when the whole
query consists of a single CALL clause.
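
For example (a minimal sketch of a standalone call with the parentheses omitted):

Query

CALL db.labels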

Example 6. CALL a procedure with literal arguments

This example calls the procedure dbms.checkConfigValue(), which checks the validity of a
configuration setting value, using literal arguments.

Query

CALL dbms.checkConfigValue('server.bolt.enabled', 'true')

Table 17. Result

"valid" "message"

true "requires restart"

Example 7. CALL a procedure using parameters

This calls the example procedure dbms.checkConfigValue() using parameters as arguments. Each
procedure argument is taken to be the value of a corresponding statement parameter with the same
name (or null if no such parameter has been given).

Parameters

{
"setting": "server.bolt.enabled",
"value": "true"
}

Query

CALL dbms.checkConfigValue($setting, $value)

Table 18. Result

"valid" "message"

true "requires restart"

Examples that use parameter arguments shows the given parameters in JSON

 format; the exact manner in which they are to be submitted depends upon the
driver being used. See Parameters for more about querying with parameters.

Example 8. CALL a procedure using both literal and parameter arguments

This calls the example procedure dbms.checkConfigValue() using both literal and parameter
arguments.

Parameters

{
"setting": "server.bolt.enabled"
}

Query

CALL dbms.checkConfigValue($setting, 'true')

Table 19. Result

"valid" "message"

true "requires restart"

Using YIELD
The YIELD keyword is used to specify which columns of procedure metadata to return, allowing for the
selection and filtering of the displayed information.

Example 9. YIELD *

Using YIELD * will return all available return columns for a procedure.

Query

CALL db.labels() YIELD *

Table 20. Result

label

"Administrator"

"Developer"

Rows: 2

If the procedure has deprecated return columns, those columns are also returned.

Note that YIELD * is only valid in standalone procedure calls. For example, the following is not valid:

Not allowed

CALL db.labels() YIELD *


RETURN count(*) AS results

Example 10. YIELD specific procedure results and filter on them

YIELD can be used to filter for specific results. This requires knowing the names of the arguments
within a procedure’s signature, which can either be found in the Operations Manual → Procedures or
returned by a SHOW PROCEDURES query.

Find the argument names of db.propertyKeys

SHOW PROCEDURES YIELD name, signature


WHERE name = 'db.propertyKeys'
RETURN signature

Table 21. Result

signature

"db.propertyKeys() :: (propertyKey :: STRING)"

Rows: 1

It is then possible to use these argument names for further query filtering. Note that if the procedure
call is part of a larger query, its output must be named explicitly. In the below example, propertyKey is
aliased as prop and then used later in the query to count the occurrence of each property in the graph.

Filter on specific argument returned by YIELD

CALL db.propertyKeys() YIELD propertyKey AS prop


MATCH (n)
WHERE n[prop] IS NOT NULL
RETURN prop, count(n) AS numNodes

Table 22. Result

prop numNodes

"name" 4

"born" 4

"nationality" 1

Rows: 3

Note on VOID procedures


Neo4j supports the notion of VOID procedures. A VOID procedure is a procedure that does not declare any
result fields and returns no result records. A VOID procedure only produces side-effects and does not allow
for the use of YIELD. Calling a VOID procedure in the middle of a larger query will simply pass on each input
record (i.e., it acts like WITH * in terms of the record stream).
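
As a hedged sketch (assuming the built-in VOID procedure db.awaitIndexes() is available), a VOID call in the middle of a query passes each incoming row straight through:

Query

MATCH (n:Developer)
CALL db.awaitIndexes()
RETURN count(n) AS developers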

Optional procedure calls (introduced in Neo4j 5.24)


OPTIONAL CALL allows for an optional procedure call. Similar to OPTIONAL MATCH, any empty rows produced
by the OPTIONAL CALL will return null.

Example 11. Difference between using CALL and OPTIONAL CALL

This query uses the procedure apoc.neighbors.tohop() from the APOC Core library, which returns all nodes
connected by the given relationship type within the specified distance (1 hop, in this case) and direction.

Regular procedure CALL

MATCH (n)
CALL apoc.neighbors.tohop(n, "KNOWS>", 1)
YIELD node
RETURN n.name AS name, collect(node.name) AS connections

Note that the result does not include the nodes in the graph without any outgoing KNOWS relationships
connected to them.

Table 23. Result

name connections

"Andy" ["Beatrice", "David"]

"Beatrice" ["Charlotte"]

Rows: 2

The same query is used below, but CALL is replaced with OPTIONAL CALL.

Optional procedure CALL

MATCH (n)
OPTIONAL CALL apoc.neighbors.tohop(n, "KNOWS>", 1)
YIELD node
RETURN n.name AS name, collect(node.name) AS connections

The result now includes the two nodes without any outgoing KNOWS relationships connected to them.

Table 24. Result

name connections

"Andy" ["Beatrice", "David"]

"Beatrice" ["Charlotte"]

"Charlotte" []

"David" []

Rows: 4

CREATE

Introduction
The CREATE clause allows you to create nodes and relationships. To define these entities, CREATE uses a
syntax similar to that of MATCH. However, while patterns only need to evaluate to either true or false, the
syntax for CREATE needs to specify exactly what nodes and relationships to create.

Syntax for nodes


The CREATE clause allows you to create one or more nodes. Each node can be assigned labels and
properties. You can bind each node to a variable that you can refer to later in the query. Multiple labels are
separated by colons.

Query

CREATE (charlie:Person:Actor {name: 'Charlie Sheen'}), (oliver:Person:Director {name: 'Oliver Stone'})

As of Neo4j 5.18, multiple labels can also be separated by an ampersand &, in the same manner as it is
used in label expressions. Separation by colon : and ampersand & cannot be mixed in the same clause.

Query

CREATE (charlie:Person&Actor {name: 'Charlie Sheen'}), (oliver:Person&Director {name: 'Oliver Stone'})

Both of the above queries create two nodes, bound to the variables charlie and oliver, each with a
Person label and a name property. The node representing Charlie Sheen also has the label Actor while the
node representing Oliver Stone is assigned the label Director.

Syntax for relationships


Relationships can also be created using the CREATE clause. Unlike nodes, relationships always need exactly
one relationship type and a direction. Similar to nodes, relationships can be assigned properties and
relationship types and be bound to variables.

Query

CREATE (charlie:Person:Actor {name: 'Charlie Sheen'})-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet:Movie


{title: 'Wall Street'})<-[:DIRECTED]-(oliver:Person:Director {name: 'Oliver Stone'})

This query creates the Person nodes for Charlie Sheen and Oliver Stone and the Movie node for Wall
Street. It also creates relationships of the types ACTED_IN and DIRECTED between them.

Reusing variables
The previous example created a path between the specified nodes. Note that these newly created nodes
and relationships are not connected to what was previously in the graph. To connect them to already
existing data, bind the desired nodes and relationships to variables. These variables can then be passed
along to subsequent clauses in a query that target pre-existing elements in the graph.

Query

MATCH (charlie:Person {name: 'Charlie Sheen'}), (oliver:Person {name: 'Oliver Stone'})


CREATE (charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet:Movie {title: 'Wall Street'})<-[:DIRECTED]-
(oliver)

In this example, the MATCH clause finds the nodes Charlie Sheen and Oliver Stone and binds them to the
charlie and oliver variables respectively. These variables are then passed along to the subsequent
CREATE clause, which creates new relationships from the bound nodes.

You can also reuse variables from the same CREATE, both in the same or a later clause. This way, you can,
for example, define constructs that are more complex than just a linear path.

Query

CREATE p = (charlie:Person:Actor {name: 'Charlie Sheen'})-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet:Movie


{title: 'Wall Street'})<-[:DIRECTED]-(oliver:Person:Director {name: 'Oliver Stone'}), (wallStreet)<-
[:ACTED_IN {role: 'Gordon Gekko'}]-(michael:Person:Actor {name: 'Michael Douglas'})
RETURN length(p)

This query creates all three nodes for Charlie Sheen, Oliver Stone and Michael Douglas and connects them all to the
node representing the Wall Street movie. It then returns the length of the path from Charlie Sheen to
Oliver Stone.

Note that when repeating a node’s variable, you may not add labels or properties to the repetition.

Query

MATCH (charlie:Person {name: 'Charlie Sheen'})


CREATE (charlie:Actor)

This query will fail because the variable charlie has already been bound to a pre-existing node, and
therefore it cannot be reused to create a new node. If you intend to add a label, use the SET clause instead.

Reusing variables in properties


The value that can be assigned to a node’s or a relationship’s property can be defined by an expression.

Query

MATCH (person:Person)
WHERE person.name IS NOT NULL
CREATE (anotherPerson:Person {name: person.name, age: $age})

This example creates a Person node with the same name as another person and the age taken from a parameter
called age.

Such an expression may not contain a reference to a variable that is defined in the same CREATE statement.
This is to ensure that the value of a property is always clear.

Query

CREATE (charlie {score: oliver.score + 1}), (oliver {score: charlie.score + 1})

This query tries to create nodes such that Charlie’s score is higher than Oliver’s and vice versa, which is a
contradiction. The query therefore fails.

Use parameters with CREATE

Create node with a parameter for the properties


You can also create a graph entity from a map. All the key/value pairs in the map will be set as properties
on the created relationship or node. In this case we add a Person label to the node as well.

Parameters

{
"props": {
"name": "Andy",
"position": "Developer"
}
}

Query

CREATE (n:Person $props)


RETURN n

Table 25. Result

n

Node[2]{name:"Andy",position:"Developer"}

Rows: 1
Nodes created: 1
Properties set: 2
Labels added: 1

Create multiple nodes with a parameter for their properties


By providing Cypher an array of maps, it will create a node for each map.

Parameters

{
"props": [ {
"name": "Andy",
"position": "Developer"
}, {
"name": "Michael",
"position": "Developer"
} ]
}

Query

UNWIND $props AS map


CREATE (n)
SET n = map

Table 26. Result


(empty result)

Rows: 0
Nodes created: 2
Properties set: 4

INSERT as a synonym of CREATE (introduced in Neo4j 5.18)


INSERT can be used as a synonym to CREATE for creating nodes and relationships, and was introduced as
part of Cypher’s GQL conformance. However, INSERT requires that multiple labels are separated by an
ampersand & and not by colon :.

Query

INSERT (tom:Person&Actor&Director {name: 'Tom Hanks'})

Creates a node, bound to the variable tom, with the labels Person, Actor, and Director and a name property.


DELETE
The DELETE clause is used to delete nodes, relationships or paths.

For removing properties and labels, see the REMOVE clause.

It is not possible to delete nodes with relationships connected to them without also deleting the
relationships. This can be done by either explicitly deleting specific relationships, or by using the DETACH
DELETE clause.

While the DELETE clause renders the deleted objects no longer accessible, the space
occupied by the deleted nodes and relationships remains on the disk and is reserved for
future transactions creating data. For information about how to clear and reuse the
space occupied by deleted objects, see Operations Manual → Space reuse.

Example graph
The following graph is used for the examples below. It shows four actors, three of whom ACTED_IN the
Movie The Matrix (Keanu Reeves, Carrie-Anne Moss, and Laurence Fishburne), and one actor who did not
act in it (Tom Hanks).

Four Person nodes (Keanu Reeves, Tom Hanks, Carrie-Anne Moss, and Laurence Fishburne) and one Movie
node (title: 'The Matrix'). Keanu Reeves, Carrie-Anne Moss, and Laurence Fishburne each have an ACTED_IN
relationship to The Matrix; Tom Hanks has no relationships.
To recreate the graph, run the following query in an empty Neo4j database:

CREATE
(keanu:Person {name: 'Keanu Reeves'}),
(laurence:Person {name: 'Laurence Fishburne'}),
(carrie:Person {name: 'Carrie-Anne Moss'}),
(tom:Person {name: 'Tom Hanks'}),
(theMatrix:Movie {title: 'The Matrix'}),
(keanu)-[:ACTED_IN]->(theMatrix),
(laurence)-[:ACTED_IN]->(theMatrix),
(carrie)-[:ACTED_IN]->(theMatrix)

Delete single node


To delete a single node, use the DELETE clause:

Query

MATCH (n:Person {name: 'Tom Hanks'})


DELETE n

This deletes the Person node Tom Hanks. Running DELETE like this is only possible on nodes that have no
relationships connected to them.

Result

Deleted 1 node

NODETACH keyword (introduced in Neo4j 5.14)
It is also possible to delete the single node using the NODETACH DELETE clause. Using the NODETACH keyword
explicitly defines that relationships will not be detached and deleted from a node. The NODETACH keyword is
a mirror of the already existing keyword DETACH, and it was introduced as part of Cypher’s GQL
conformance. Including it is functionally the same as using simple DELETE.

Query

MATCH (n:Person {name: 'Tom Hanks'})


NODETACH DELETE n

This also deletes the Person node Tom Hanks.

Delete relationships only


It is possible to delete a relationship while leaving the node(s) connected to that relationship otherwise
unaffected.

Query

MATCH (n:Person {name: 'Laurence Fishburne'})-[r:ACTED_IN]->()


DELETE r

This deletes all outgoing ACTED_IN relationships from the Person node Laurence Fishburne, without
deleting the node.

Result

Deleted 1 relationship

Delete a node with all its relationships


To delete nodes and any relationships connected to them, use the DETACH DELETE clause.

Query

MATCH (n:Person {name: 'Carrie-Anne Moss'})


DETACH DELETE n

This deletes the Person node Carrie-Anne Moss and all relationships connected to it.

Result

Deleted 1 node, deleted 1 relationship

The DETACH DELETE clause may not be permitted to users with restricted security
 privileges. For more information, see Operations Manual → Fine-grained access control.

Delete all nodes and relationships
It is possible to delete all nodes and relationships in a graph.

Query

MATCH (n)
DETACH DELETE n

Result

Deleted 3 nodes, deleted 1 relationship

DETACH DELETE is not suitable for deleting large amounts of data, but is useful when

 experimenting with small example datasets. To delete large amounts of data, instead
use CALL subqueries in transactions.
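
A minimal sketch of that approach (the batch size is illustrative):

Query

MATCH (n)
CALL {
  WITH n
  DETACH DELETE n
} IN TRANSACTIONS OF 10000 ROWS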

FINISH
A query ending in FINISH — instead of RETURN — has no result but executes all its side effects. FINISH was
introduced as part of Cypher’s GQL conformance.

The following read query successfully executes but has no results:

Query

MATCH (p:Person)
FINISH

The following query has no result but creates one node with the label Person:

Query

CREATE (p:Person)
FINISH

It is equivalent to the following query:

Query

CREATE (p:Person)

FOREACH
Lists and paths are key concepts in Cypher. The FOREACH clause can be used to update data, such as
executing update commands on elements in a path, or on a list created by aggregation.

The variable context within the FOREACH parenthesis is separate from the one outside it. This means that if
you CREATE a node variable within a FOREACH, you will not be able to use it outside of the foreach statement,
unless you match to find it.

Within the FOREACH parentheses, you can do any of the updating commands — SET, REMOVE, CREATE, MERGE,
DELETE, and FOREACH.

If you want to execute an additional MATCH for each element in a list, the UNWIND
clause would be a more appropriate command (see the sketch after the example below).

The following graph is used for the example below: four Person nodes named 'A', 'B', 'C', and 'D',
connected in a chain by KNOWS relationships from 'A' to 'B' to 'C' to 'D'.

Mark all nodes along a path


This query will set the property marked to true on all nodes along a path.

Query

MATCH p=(start)-[*]->(finish)
WHERE start.name = 'A' AND finish.name = 'D'
FOREACH (n IN nodes(p) | SET n.marked = true)

Table 27. Result


(empty result)

Rows: 0
Properties set: 4
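
As a hedged sketch of the UNWIND alternative mentioned above, the same update can be written so that each list element becomes its own row:

Query

MATCH p=(start)-[*]->(finish)
WHERE start.name = 'A' AND finish.name = 'D'
UNWIND nodes(p) AS n
SET n.marked = true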

LIMIT
LIMIT constrains the number of returned rows.

LIMIT accepts any expression that evaluates to a positive INTEGER and does not refer to nodes or
relationships.

Without ORDER BY, Neo4j does not guarantee which rows are returned by LIMIT. The only clause that
guarantees a specific row order is ORDER BY.

Example graph
The following graph is used for the examples below:

Five Person nodes: Andy, Bernard, Charlotte, David, and Erika. Andy has outgoing KNOWS relationships
to the other four.

To recreate it, run the following query against an empty Neo4j database:

CREATE
(andy: Person {name: 'Andy'}),
(bernard: Person {name: 'Bernard'}),
(charlotte: Person {name: 'Charlotte'}),
(david: Person {name: 'David'}),
(erika: Person {name: 'Erika'}),
(andy)-[:KNOWS]->(bernard),
(andy)-[:KNOWS]->(charlotte),
(andy)-[:KNOWS]->(david),
(andy)-[:KNOWS]->(erika)

Examples

Example 12. Return a limited subset of the rows

To return a limited subset of the rows, use this syntax:

Query

MATCH (n)
RETURN n.name
ORDER BY n.name
LIMIT 3

The example query limits the output to the first 3 rows.

Table 28. Result

n.name

"Andy"

"Bernard"

"Charlotte"

Rows: 3

Example 13. Using an expression with LIMIT to return a subset of the rows

LIMIT accepts any expression that evaluates to a positive integer, as long as it can be statically
calculated (i.e. calculated before the query is run).

Query

MATCH (n)
RETURN n.name
ORDER BY n.name
LIMIT 1 + toInteger(3 * rand())

The limit is 1 plus a random value of 0, 1, or 2, so the query randomly returns 1, 2, or 3 rows.

Table 29. Result

n.name

"Andy"

"Bernard"

"Charlotte"

Rows: 3

LIMIT and side effects


The use of LIMIT in a query will not stop side effects, like CREATE, DELETE, or SET, from happening if the limit
is in the same query part as the side effect.

Query

CREATE (n)
RETURN n
LIMIT 0

This query returns nothing, but creates one node:

Table 30. Result


(empty result)

Rows: 0
Nodes created: 1

Query

MATCH (n {name: 'A'})


SET n.age = 60
RETURN n
LIMIT 0

This query returns nothing, but writes one property:

Table 31. Result


(empty result)

Rows: 0
Properties set: 1

If we want to limit the number of updates we can split the query using the WITH clause:

Query

MATCH (n)
WITH n ORDER BY n.name LIMIT 1
SET n.locked = true
RETURN n

This writes the locked property on one node and returns that node:

Table 32. Result

{locked:true,name:"Andy",age:60}

Rows: 1
Properties set: 1

Using LIMIT as a standalone clause (introduced in Neo4j 5.24)


LIMIT can be used as a standalone clause, or in conjunction with ORDER BY or SKIP/OFFSET.

Standalone use of LIMIT

MATCH (n)
LIMIT 2
RETURN collect(n.name) AS names

Table 33. Result

names

["Andy", "Bernard"]

Rows: 1

The following query orders all nodes by name in descending order, skips the first two rows, and limits the results to
two rows. It then collects the results in a list.

LIMIT used in conjunction with ORDER BY and SKIP

MATCH (n)
ORDER BY n.name DESC
SKIP 2
LIMIT 2
RETURN collect(n.name) AS names

Table 34. Result

names

["David", "Charlotte"]

Rows: 1


LOAD CSV
LOAD CSV is used to import data from CSV files into a Neo4j database.

LOAD CSV FROM 'https://data.neo4j.com/bands/artists.csv' ①


AS row ②
MERGE (:Artist {name: row[1], year: toInteger(row[2])}) ③

① FROM takes a STRING containing the path where the CSV file is located.

② The clause parses one row at a time, temporarily storing the current row in the variable specified with
AS.

③ The MERGE clause accesses the row variable to insert data into the database.

LOAD CSV supports both local and remote URLs. Local paths are resolved relative to the Neo4j installation
folder.

 Loading CSV files requires load privileges.

Import CSV data into Neo4j

Import local files (not available on Aura)


You can store CSV files on the database server and then access them by using a file:/// URL. By default,
paths are resolved relative to the Neo4j import directory.

Example 14. Import artists name and year information from a local file

artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 'file:///artists.csv' AS row


MERGE (a:Artist {name: row[1], year: toInteger(row[2])})
RETURN a.name, a.year

Result

a.name a.year

'ABBA' '1992'

'Roxette' '1986'

'Europe' '1979'

'The Cardigans' '1992'

4 rows

Added 4 nodes, Set 8 properties, Added 4 labels

 For ways of importing data into an Aura instance, see Aura → Importing data.

When using file:/// URLs, spaces and other non-alphanumeric characters must be
 URL-encoded.

Configuration settings for file URLs

dbms.security.allow_csv_import_from_file_urls
This setting determines whether file:/// URLs are allowed.

server.directories.import
This setting sets the root directory relative to which file:/// URLs are parsed.
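
As a hedged sketch, the current values of these settings can be inspected from Cypher (given sufficient privileges):

Query

SHOW SETTINGS YIELD name, value
WHERE name IN ['dbms.security.allow_csv_import_from_file_urls', 'server.directories.import']
RETURN name, value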

Import from a remote location
You can import data from a CSV file hosted on a remote path.

LOAD CSV supports accessing CSV files via HTTPS, HTTP, and FTP (with or without credentials). It also
follows redirects, except those changing the protocol (for security reasons).

It is strongly recommended to permit resource loading only over secure protocols such
as HTTPS instead of insecure protocols like HTTP. This can be done by limiting the load
privileges to only trusted sources that use secure protocols. If allowing an insecure
protocol is absolutely unavoidable, Neo4j takes measures internally to enhance the
 security of these requests within their limitations. However, this means that insecure
URLs on virtual hosts will not function unless you add the JVM argument
-Dsun.net.http.allowRestrictedHeaders=true to the configuration setting
server.jvm.additional.

Example 15. Import artists name and year information from a remote file via HTTPS

https://data.neo4j.com/bands/artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 'https://data.neo4j.com/bands/artists.csv' AS row
MERGE (a:Artist {name: row[1], year: toInteger(row[2])})
RETURN a.name, a.year

Result

a.name a.year

'ABBA' '1992'

'Roxette' '1986'

'Europe' '1979'

'The Cardigans' '1992'

4 rows

Added 4 nodes, Set 8 properties, Added 4 labels

Example 16. Import artists name and year information from a remote file via FTP using credentials

ftp://<username>:<password>@<domain>/bands/artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 'ftp://<username>:<password>@<domain>/bands/artists.csv' AS row


MERGE (a:Artist {name: row[1], year: toInteger(row[2])})
RETURN a.name, a.year

Result

a.name a.year

'ABBA' '1992'

'Roxette' '1986'

'Europe' '1979'

'The Cardigans' '1992'

4 rows

Added 4 nodes, Set 8 properties, Added 4 labels

Import from cloud URIs (Enterprise Edition, not available on Aura)


You can import data from a number of different cloud storages:

• Azure Cloud Storage

• Google Cloud Storage

• AWS S3

See Operations Manual → Load a dump from a cloud storage on how to set up access to cloud storages.

Import from an Azure Cloud Storage URI (introduced in Neo4j 5.24)

You can import data from a CSV file hosted in an Azure Cloud Storage URI.

Example 17. Import artists name and year information from an Azure Cloud Storage URI

azb://azb-account/azb-container/artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 'azb://azb-account/azb-container/artists.csv' AS row


MERGE (a:Artist {name: row[1], year: toInteger(row[2])})
RETURN a.name, a.year

Result

a.name a.year

'ABBA' '1992'

'Roxette' '1986'

'Europe' '1979'

'The Cardigans' '1992'

4 rows

Added 4 nodes, Set 8 properties, Added 4 labels

Import from a Google Cloud Storage URI (introduced in Neo4j 5.21)

You can import data from a CSV file hosted in a Google Cloud Storage URI.

Example 18. Import artists name and year information from a Google Cloud Storage URI

gs://gs-bucket/artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 'gs://gs-bucket/artists.csv' AS row


MERGE (a:Artist {name: row[1], year: toInteger(row[2])})
RETURN a.name, a.year

Result

a.name a.year

'ABBA' '1992'

'Roxette' '1986'

'Europe' '1979'

'The Cardigans' '1992'

4 rows

Added 4 nodes, Set 8 properties, Added 4 labels

Import from an AWS S3 URI (introduced in Neo4j 5.19)

You can import data from a CSV file hosted in an AWS S3 URI.

Example 19. Import artists name and year information from an AWS S3 URI

s3://aws-bucket/artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 's3://aws-bucket/artists.csv' AS row


MERGE (a:Artist {name: row[1], year: toInteger(row[2])})
RETURN a.name, a.year

Result

a.name a.year

'ABBA' '1992'

'Roxette' '1986'

'Europe' '1979'

'The Cardigans' '1992'

4 rows

Added 4 nodes, Set 8 properties, Added 4 labels

Import compressed CSV files


LOAD CSV can read local CSV files compressed with ZIP or gzip. ZIP archives can have arbitrary directory
structures but may only contain a single CSV file.

Import a CSV file from within a ZIP file

LOAD CSV FROM 'file:///artists.zip' AS row


MERGE (:Artist {name: row[1], year: toInteger(row[2])})

 You can’t load zipped CSV files from remote URLs.

Import data from relational databases


If the source data comes from a relational model, it’s worth evaluating how to gain the most from moving
to a graph data model. Before running the import, think about how the data can be modeled as a graph,
and adapt its structure accordingly when running the import (see Graph data modeling).

Data from relational databases may consist of one or multiple CSV files, depending on the source database
structure. A performant approach is to run multiple passes of LOAD CSV to import nodes separately from
relationships.

Example 20. Import from a single CSV file

The source file books.csv contains information about both authors and books. From a graph
perspective, these are nodes with different labels, so it takes different queries to load them.

The example executes multiple passes of LOAD CSV on that one file, and each pass focuses on the
creation of one entity type.

books.csv

id,title,author,publication_year,genre,rating,still_in_print,last_purchased
19515,The Heights,Anne Conrad,2012,Comedy,5,true,2023/4/12 8:17:00
39913,Starship Ghost,Michael Tyler,1985,Science Fiction|Horror,4.2,false,2022/01/16 17:15:56
60980,The Death Proxy,Tim Brown,2002,Horror,2.1,true,2023/11/26 8:34:26
18793,Chocolate Timeline,Mary R. Robb,1924,Romance,3.5,false,2022/9/17 14:23:45
67162,Stories of Three,Eleanor Link,2022,Romance|Comedy,2,true,2023/03/12 16:01:23
25987,Route Down Below,Tim Brown,2006,Horror,4.1,true,2023/09/24 15:34:18

Query

// Create `Book` nodes
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/books.csv' AS row
MERGE (b:Book {id: row.id, title: row.title})
MERGE (a:Author {name: row.author});

// Create `WROTE` relationships
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/books.csv' AS row
MATCH (a:Author {name: row.author})
MATCH (b:Book {id: row.id})
MERGE (a)-[:WROTE]->(b);

Result

Added 11 nodes, Set 17 properties, Added 11 labels


Created 6 relationships

Example 21. Import from multiple CSV files

The file acted_in.csv contains data about the relationship between actors and the movies they acted
in (from persons.csv and movies.csv). Actors and movies are linked through their ID columns
person_tmdbId and movieId.

The file also contains the role the actor played in the movie, and it is imported in Neo4j as a
relationship property.

acted_in.csv

movieId,person_tmdbId,role
1,12899,Slinky Dog (voice)
1,12898,Buzz Lightyear (voice)
...

It takes three LOAD CSV clauses to import this dataset: the first two create Person nodes from
persons.csv and Movie nodes from movies.csv, and the third adds the :ACTED_IN relationship from
acted_in.csv.

Query

// Create person nodes
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {name: row.name, tmdbId: row.person_tmdbId});

// Create movie nodes
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
MERGE (m:Movie {movieId: row.movieId, title: row.title});

// Create relationships
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/acted_in.csv' AS row
MATCH (p:Person {tmdbId: row.person_tmdbId})
MATCH (m:Movie {movieId: row.movieId})
MERGE (p)-[r:ACTED_IN {role: row.role}]->(m);

Result

Added 444 nodes, Set 888 properties, Added 444 labels


Added 93 nodes, Set 186 properties, Added 93 labels
Created 372 relationships, Set 372 properties

For a guide on importing the Northwind dataset from Postgres into Neo4j, see Tutorial:
 Import data from a relational database into Neo4j in the Getting Started Guide.

Create additional node labels

In Neo4j a node can have multiple labels, while in a relational setting it’s not as straightforward to mix
entities. For example, a node in Neo4j can be labeled both Dog and Actor, while in a relational model dogs
and actors are separate entities.

After a relational dataset has been imported, there may be further labels that can be added, depending on
the use case. Additional labels can speed up pinpointing a node if you use them in your queries.

Example 22. Add extra Actor label on Person nodes

The :ACTED_IN relationship from acted_in.csv implicitly defines actors as a subset of people. The
following query adds an additional Actor label to all people who have an outgoing :ACTED_IN
relationship.

Query

MATCH (p:Person)-[:ACTED_IN]->()
WITH DISTINCT p
SET p:Actor

Result

Added 353 labels

Pre-process the data during import

Cast CSV columns to Neo4j data types


LOAD CSV inserts all imported CSV data as STRING properties. However, Neo4j supports a range of data
types, and storing data with appropriate types makes it possible both to query it more effectively and to
process it with type-specific Cypher functions.

Example 23. Import numeric and temporal data

The columns person_tmdbId and born in the file persons.csv contain INTEGER and DATE values
respectively. The functions toInteger() and date() allow you to cast those values to the appropriate
types before importing them.

persons.csv

person_tmdbId,bio,born,bornIn,died,person_imdbId,name,person_poster,person_url
3,"Legendary Hollywood Icon Harrison Ford was born on July 13, 1942 in Chicago, Illinois. His
family history includes a strong lineage of actors, radio personalities, and models. Harrison
attended public high school in Park Ridge, Illinois where he was a member of the school Radio Station
WMTH. Harrison worked as the lead voice for sports reporting at WMTH for several years. Acting
wasn’t a major interest to Ford until his junior year at Ripon College when he first took an acting
class...",1942-07-13,"Chicago, Illinois, USA",,148,Harrison
Ford,https://image.tmdb.org/t/p/w440_and_h660_face/5M7oN3sznp99hWYQ9sX0xheswWX.jpg,https://themoviedb
.org/person/3
...

Query

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET p.name = row.name, p.born = date(row.born)
RETURN
p.name AS name,
p.tmdbId AS tmdbId,
p.born AS born
LIMIT 5

Result

name tmdbId born

'Harrison Ford' 3 1942-07-13

'Tom Hanks' 31 1956-07-09

'Robin Wright' 32 1966-04-08

'Sally Field' 35 1946-11-06

'Sean Bean' 48 1959-04-17

5 rows

Added 444 nodes, Set 1332 properties, Added 444 labels

For a list of type casting functions, see Casting data values.

Handle null values


Neo4j does not store null values. null or empty fields in a CSV file can be skipped or replaced with
default values in LOAD CSV.

Example 24. Processing a file with null values

In the file companies.csv, some rows do not specify values for some columns. The examples show
several options of how to handle null values.

companies.csv

Id,Name,Location,Email,BusinessType
1,Neo4j,San Mateo,[email protected],P
2,AAA,,[email protected],
3,BBB,Chicago, ,G
,CCC,Michigan,[email protected],G

Skip null values

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row


WITH row
WHERE row.Id IS NOT NULL
MERGE (c:Company {id: row.Id})

Provide a default for null values

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row


WITH row
WHERE row.Id IS NOT NULL
MERGE (c:Company {id: row.Id, hqLocation: coalesce(row.Location, "Unknown")})

Change empty STRING values to null values (not stored)

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row


WITH row
WHERE row.Id IS NOT NULL
MERGE (c:Company {id: row.Id})
SET c.email = nullIf(trim(row.Email), "")

null values are not stored in the database. A strategy for selectively getting rid of some values is to map
them into null values. The empty STRING values from the last query serve as an example.

Split list values


The function split() allows you to convert a STRING of elements into a list.

Example 25. Parse movies languages and genres as lists

The file movies.csv contains a header line and a total of 94 lines.

The columns languages and genres contain list-like values. Both are separated by a pipe |, and
split() allows you to turn them into Cypher lists ahead of inserting them into the database.

movies.csv

movieId,title,budget,countries,movie_imdbId,imdbRating,imdbVotes,languages,plot,movie_poster,released
,revenue,runtime,movie_tmdbId,movie_url,year,genres
1,Toy Story,30000000.0,USA,114709,8.3,591836,English,A cowboy doll is profoundly threatened and
jealous when a new spaceman figure supplants him as top toy in a boy's
room.,https://image.tmdb.org/t/p/w440_and_h660_face/uXDfjJbdP4ijW5hWSBrPrlKpxab.jpg,1995-11-
22,373554033.0,81,862,https://themoviedb.org/movie/862,1995,Adventure|Animation|Children|Comedy|Fanta
sy
2,Jumanji,65000000.0,USA,113497,6.9,198355,English|French,"When two kids find and play a magical
board game, they release a man trapped for decades in it and a host of dangers that can only be
stopped by finishing the
game.",https://image.tmdb.org/t/p/w440_and_h660_face/vgpXmVaVyUL7GGiDeiK1mKEKzcX.jpg,1995-12-
15,262797249.0,104,8844,https://themoviedb.org/movie/8844,1995,Adventure|Children|Fantasy
...

Query

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
MERGE (m:Movie {id: toInteger(row.movieId)})
SET
m.title = row.title,
m.imdbId = toInteger(row.movie_imdbId),
m.languages = split(row.languages, '|'),
m.genres = split(row.genres, '|')
RETURN
m.title AS title,
m.imdbId AS imdbId,
m.languages AS languages,
m.genres AS genres
LIMIT 5

Result

title imdbId languages genres

'Toy Story' 114709 ['English'] ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy']

'Jumanji' 113497 ['English', 'French'] ['Adventure', 'Children', 'Fantasy']

'Grumpier Old Men' 113228 ['English'] ['Comedy', 'Romance']

'Waiting to Exhale' 114885 ['English'] ['Comedy', 'Romance', 'Drama']

'Father of the Bride Part II' 113041 ['English'] ['Comedy']

5 rows

Added 93 nodes, Set 465 properties, Added 93 labels

For more STRING manipulation functions, see String functions.

Recommendations

Create property uniqueness constraints


Always create property uniqueness constraints prior to importing data, to avoid duplicates or colliding
entities. If the source file contains duplicated data and the right constraints are in place, Cypher raises an
error.

Example 26. Create a node property uniqueness constraint on person ID

persons.csv

person_tmdbId,bio,born,bornIn,died,person_imdbId,name,person_poster,person_url
3,"Legendary Hollywood Icon Harrison Ford was born on July 13, 1942 in Chicago, Illinois. His
family history includes a strong lineage of actors, radio personalities, and models. Harrison
attended public high school in Park Ridge, Illinois where he was a member of the school Radio Station
WMTH. Harrison worked as the lead voice for sports reporting at WMTH for several years. Acting
wasn’t a major interest to Ford until his junior year at Ripon College when he first took an acting
class...",1942-07-13,"Chicago, Illinois, USA",,148,Harrison
Ford,https://image.tmdb.org/t/p/w440_and_h660_face/5M7oN3sznp99hWYQ9sX0xheswWX.jpg,https://themoviedb
.org/person/3
...

Create a node property uniqueness constraint on person ID

CREATE CONSTRAINT Person_tmdbId IF NOT EXISTS


FOR (p:Person) REQUIRE p.tmdbId IS UNIQUE

Result

Added 1 constraints

Handle large amounts of data


LOAD CSV may run into memory issues with files containing a significant number of rows (approaching
hundreds of thousands or millions). For large files, it's recommended to split the import process into several
lighter transactions through the clause CALL {…} IN TRANSACTIONS.

Example 27. Load a large CSV file in several transactions

The file persons.csv contains a header line and a total of 869 lines. The example loads the name and
born columns in transactions of 200 rows.

persons.csv

person_tmdbId,bio,born,bornIn,died,person_imdbId,name,person_poster,person_url
3,"Legendary Hollywood Icon Harrison Ford was born on July 13, 1942 in Chicago, Illinois. His
family history includes a strong lineage of actors, radio personalities, and models. Harrison
attended public high school in Park Ridge, Illinois where he was a member of the school Radio Station
WMTH. Harrison worked as the lead voice for sports reporting at WMTH for several years. Acting
wasn’t a major interest to Ford until his junior year at Ripon College when he first took an acting
class...",1942-07-13,"Chicago, Illinois, USA",,148,Harrison
Ford,https://image.tmdb.org/t/p/w440_and_h660_face/5M7oN3sznp99hWYQ9sX0xheswWX.jpg,https://themoviedb
.org/person/3
...

The below query uses a variable scope clause (introduced in Neo4j 5.23) to import variables into the CALL
subquery. If you are using an older version of Neo4j, use an importing WITH clause instead.

Query

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
CALL (row) {
MERGE (p:Person {tmdbId: row.person_tmdbId})
SET p.name = row.name, p.born = row.born
} IN TRANSACTIONS OF 200 ROWS

Result

Added 444 nodes, Set 1332 properties, Added 444 labels

In case of errors, CALL {…} IN TRANSACTIONS may only import a part of the CSV data as
the transactions are committed. For example, if the first 200 rows are error free, they are
 committed. If the next 200 rows contain data that causes an error, the second
transaction fails, but leaves the first transaction unaffected.
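
If the import should continue past a failing batch, more recent Neo4j 5 versions also accept an
error-handling mode after IN TRANSACTIONS. The following is a sketch only; it assumes a version that
supports both the variable scope clause and ON ERROR CONTINUE:

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
CALL (row) {
MERGE (p:Person {tmdbId: row.person_tmdbId})
SET p.name = row.name, p.born = row.born
} IN TRANSACTIONS OF 200 ROWS ON ERROR CONTINUE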

LOAD CSV and Neo4j functions

Access line numbers with linenumber()


The linenumber() function provides the line number which LOAD CSV is operating on, or null if called
outside of a LOAD CSV context.

A common use case for this function is to generate sequential unique IDs for CSV data that doesn’t have a
unique column already.
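
As a minimal sketch of that use case (reusing the artists.csv file from the earlier examples and a
hypothetical importId property), the current line number can be stored while the rows are imported:

LOAD CSV FROM 'file:///artists.csv' AS row
MERGE (a:Artist {name: row[1], year: toInteger(row[2])})
SET a.importId = linenumber()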

Example 28. linenumber()

artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 'file:///artists.csv' AS row


RETURN linenumber() AS number, row

Result

number row

1 ["1","ABBA","1992"]

2 ["2","Roxette","1986"]

3 ["3","Europe","1979"]

4 ["4","The Cardigans","1992"]

4 rows

Access the CSV file path with file()


The file() function provides the absolute path of the file that LOAD CSV is operating on, or null if called
outside of a LOAD CSV context.

Example 29. file()

artists.csv

1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV FROM 'file:///artists.csv' AS row


RETURN DISTINCT file() AS path

Result

path

'/artists.csv'

file() always returns a local path, even when loading remote CSV files. For remote
 resources, file() returns the temporary local path it was downloaded to.

CSV file format
The CSV file format and LOAD CSV interact as follows:

• The file character encoding must be UTF-8.

• The line terminator is system dependent (\n for Unix and \r\n for Windows).

• The default field delimiter is ,. Change it with the option FIELDTERMINATOR.

• CSV files may contain quoted STRING values, and the quotes are dropped when LOAD CSV reads the
data.

• If dbms.import.csv.legacy_quote_escaping is set to the default value of true, \ is used as an escape
character.

• A double quote must be in a quoted STRING and escaped, with either the escape character or a second
double quote.

Headers
If the CSV file starts with a header row containing column names, each import row in the file acts as a map
instead of an array.

You must indicate the presence of the header row by adding WITH HEADERS to the query. You can then
access specific fields by their corresponding column name.

Example 30. Parsing a CSV as a list of maps

artists-with-headers.csv

Id,Name,Year
1,ABBA,1992
2,Roxette,1986
3,Europe,1979
4,The Cardigans,1992

Query

LOAD CSV WITH HEADERS FROM 'file:///artists-with-headers.csv' AS row


MERGE (a:Artist {name: row.Name, year: toInteger(row.Year)})
RETURN
a.name AS name,
a.year AS year

Result

name year

"ABBA" 1992

"Roxette" 1986

"Europe" 1979

"The Cardigans" 1992

4 rows

Added 4 nodes, Set 8 properties, Added 4 labels

Field delimiter
The default field delimiter is ,. Use the FIELDTERMINATOR option to specify a different field delimiter.

If you try to import a file that doesn’t use , as field delimiter and you also don’t specify a custom delimiter,
LOAD CSV will interpret the CSV as having a single column.

Example 31. Import a CSV using ; as field delimiter

artists-fieldterminator.csv

1;ABBA;1992
2;Roxette;1986
3;Europe;1979
4;The Cardigans;1992

Query

LOAD CSV FROM 'file:///artists-fieldterminator.csv' AS row FIELDTERMINATOR ';'


MERGE (:Artist {name: row[1], year: toInteger(row[2])})

Result

Added 4 nodes, Set 8 properties, Added 4 labels

You can use the hexadecimal representation of the unicode character for the field delimiter if you prepend
\u. Write the encoding with four digits: for example, \u003B is equivalent to ; (semicolon).
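
For instance, the semicolon-delimited file from the example above could equally be loaded with the
hexadecimal form (a sketch that is otherwise identical to the previous query):

LOAD CSV FROM 'file:///artists-fieldterminator.csv' AS row FIELDTERMINATOR '\u003B'
MERGE (:Artist {name: row[1], year: toInteger(row[2])})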

Quotes escaping
Quoted STRING values are allowed in the CSV file and the quotes are dropped when LOAD CSV reads the
data. If quoted STRING values must contain quote characters ", there are two ways to escape them:

1. Double quotes — Use another quote " to escape a quote (for example, the CSV encoding of the STRING
The "Symbol" is "The ""Symbol""").

2. Prefix with backslash \ — If the configuration setting dbms.import.csv.legacy_quote_escaping is set to
true (the default value), \ works as the escape character for quotes (for example, the CSV encoding of the
STRING The "Symbol" is "The \"Symbol\"").

Example 32. Import a CSV with double-quotes escaping

artists-with-escaped-quotes.csv

"1","The ""Symbol""","1992"
"2","The \"Symbol\"","1992"

Query

LOAD CSV FROM 'file:///artists-with-escaped-quotes.csv' AS row


MERGE (a:Artist {id: toInteger(row[0]), name: row[1], year: toInteger(row[2])})
RETURN
a.id AS id,
a.name AS name,
a.year AS year,
size(a.name) AS size

Result

id name year size

1 'The "Symbol"' 1992 12

2 'The "Symbol"' 1992 12

Added 2 nodes, Set 6 properties, Added 2 labels

Note that name is a STRING, as it is wrapped in quotes in the output. The third column outputs the
STRING length as size. The length only counts what is between the outer quotes, but not the quotes
themselves.

Check source data quality


In case of a failed import, there are some elements to check to ensure the source file is not corrupted.

• Inconsistent headers — The CSV header may be inconsistent with the data. It can be missing, have too
many columns or have a different delimiter. Verify that the header matches the data in the file. Adjust
the formatting, delimiters or columns.

• Extra or missing quotes — Standalone double or single quotes in the middle of non-quoted text or
non-escaped quotes in quoted text can cause issues reading the file. Either escape or remove stray
quotes. See Quotes escaping.

• Special or newline characters — When dealing with special characters in a file, ensure they are quoted
or remove them.

• Inconsistent line breaks — Ensure line breaks are consistent throughout your file.

• Binary zeros, BOM byte order mark and other non-text characters — Unusual characters or tool-
specific formatting are sometimes hidden in application tools, but become apparent in plain-text
editors. If you come across these types of characters in your file, either remove them or use Cypher’s
normalize function.
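
As a sketch of the last point (assuming Neo4j 5.17 or later, where the normalize() function is available,
and reusing the companies.csv file from Example 24), stray Unicode variants in a column can be cleaned up
while loading:

LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
WITH row
WHERE row.Id IS NOT NULL
MERGE (c:Company {id: row.Id})
SET c.name = normalize(trim(row.Name))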

Inspect source files ahead of import

Before importing data into the database, you can use LOAD CSV to inspect a source file and get an idea of
what form the imported data is going to have.

Example 33. Assert correct line count

// Assert correct line count
LOAD CSV FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS line
RETURN count(*);

Result

count(*)

445

1 row

Example 34. Check the first five lines with header sampling

// Check first 5 line-sample with header-mapping
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS line
RETURN line.person_tmdbId, line.name
LIMIT 5;

Result

line.person_tmdbId line.name

'3' 'Harrison Ford'

'31' 'Tom Hanks'

'32' 'Robin Wright'

'35' 'Sally Field'

'48' 'Sean Bean'

5 rows

Example

Erase current database and import the full movie dataset

// Clear data
MATCH (n) DETACH DELETE n;

// Create constraints
CREATE CONSTRAINT Person_tmdbId IF NOT EXISTS
FOR (p:Person) REQUIRE p.tmdbId IS UNIQUE;

CREATE CONSTRAINT Movie_movieId IF NOT EXISTS
FOR (m:Movie) REQUIRE m.movieId IS UNIQUE;

// Create person nodes
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET p.name = row.name, p.born = date(row.born);

// Create movie nodes
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
MERGE (m:Movie {id: toInteger(row.movieId)})
SET
m.title = row.title,
m.imdbId = toInteger(row.movie_imdbId),
m.languages = split(row.languages, '|'),
m.genres = split(row.genres, '|');

// Create relationships
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/acted_in.csv' AS row
MATCH (p:Person {tmdbId: toInteger(row.person_tmdbId)})
MATCH (m:Movie {id: toInteger(row.movieId)})
MERGE (p)-[r:ACTED_IN]->(m)
SET r.role = row.role;

// Set additional node label
MATCH (p:Person)-[:ACTED_IN]->()
WITH DISTINCT p
SET p:Actor;

Result

Added 1 constraints
Added 1 constraints
Added 444 nodes, Set 1332 properties, Added 444 labels
Added 93 nodes, Set 465 properties, Added 93 labels
Created 372 relationships, Set 372 properties
Added 353 labels

With increasing amounts of data, it is more efficient to create all nodes first, and then
 add relationships with a second pass.

Other ways of importing data


There are a few other tools to get CSV data into Neo4j.

1. The neo4j-admin database import command is the most efficient way of importing large CSV files.

2. A language library or the APOC extension library can be used to parse CSV data and run creation Cypher
queries against a Neo4j database. APOC was created as an extension library to provide common procedures
and functions to developers, and it is especially helpful for complex transformations and data
manipulations. Useful procedures include apoc.load.jdbc, apoc.load.json, and others.

3. The ETL Tool allows you to extract the schema from a relational database and turn it into a graph model.
It then takes care of importing the data into Neo4j.

4. The Kettle import tool maps and executes steps for the data process flow and works well for very large
datasets, especially if you are already familiar with using this tool.


MATCH
The MATCH clause enables you to define specific patterns that the database will search for within its graph
structure. The MATCH clause can specify the nodes, relationships, and properties in a pattern, allowing for
queries that traverse the graph to retrieve relevant data.

Example graph
The following graph is used for the examples below:

[Example graph: Person nodes for Charlie Sheen, Martin Sheen, Michael Douglas, Oliver Stone, and Rob Reiner,
connected to the Movie nodes Wall Street and The American President by ACTED_IN relationships (with role
properties such as 'Bud Fox', 'Carl Fox', 'Gordon Gekko', 'A.J. MacInerney', and 'President Andrew Shepherd')
and by DIRECTED relationships.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE (charlie:Person {name: 'Charlie Sheen'}),
(martin:Person {name: 'Martin Sheen'}),
(michael:Person {name: 'Michael Douglas'}),
(oliver:Person {name: 'Oliver Stone'}),
(rob:Person {name: 'Rob Reiner'}),
(wallStreet:Movie {title: 'Wall Street'}),
(charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet),
(martin)-[:ACTED_IN {role: 'Carl Fox'}]->(wallStreet),
(michael)-[:ACTED_IN {role: 'Gordon Gekko'}]->(wallStreet),
(oliver)-[:DIRECTED]->(wallStreet),
(thePresident:Movie {title: 'The American President'}),
(martin)-[:ACTED_IN {role: 'A.J. MacInerney'}]->(thePresident),
(michael)-[:ACTED_IN {role: 'President Andrew Shepherd'}]->(thePresident),
(rob)-[:DIRECTED]->(thePresident)

Find nodes
The MATCH clause allows you to specify node patterns of varying complexity to retrieve from a graph. For
more information about finding node patterns, see Patterns → Node patterns.

Find all nodes


By specifying a pattern with a single node and no labels, all nodes in the graph will be returned.

Find all nodes in a graph

MATCH (n)
RETURN n

Result

(:Person {"name":"Charlie Sheen"})

(:Person {"name":"Martin Sheen"})

(:Person {"name":"Michael Douglas"})

(:Person {"name":"Oliver Stone"})

(:Person {"name":"Rob Reiner"})

(:Movie {"title":"Wall Street"})

(:Movie {"title":"The American President"})

Rows: 7

Find nodes with a specific label


Find all nodes with the Movie label

MATCH (movie:Movie)
RETURN movie.title

Result

movie.title

"Wall Street"

"The American President"

Rows: 2

MATCH using node label expressions


Node pattern using the OR (|) label expression

MATCH (n:Movie|Person)
RETURN n.name AS name, n.title AS title

Result

name title

"Charlie Sheen" <null>

"Martin Sheen" <null>

"Michael Douglas" <null>

"Oliver Stone" <null>

"Rob Reiner" <null>

<null> "Wall Street"

<null> "The American President"

Rows: 7

Node pattern using negation (!) label expression

MATCH (n:!Movie)
RETURN labels(n) AS label, count(n) AS labelCount

 The above query uses the labels() and count() functions.

Result

label labelCount

["Person"] 5

Rows: 1

For a list of all label expressions supported by Cypher, see Patterns → Label expressions.

Find relationships
The MATCH clause allows you to specify relationship patterns of varying complexity to retrieve from a graph.
Unlike a node pattern, a relationship pattern cannot be used in a MATCH clause without node patterns at
both ends. For more information about relationship patterns, see Patterns → Relationship patterns.

Relationships will only be matched once inside a single pattern. Read more about this
 behavior in the section on relationship uniqueness.

Empty relationship patterns


By applying --, a pattern will be matched for a relationship with any direction and without any filtering on
relationship types or properties.

Find connected nodes using an empty relationship pattern

MATCH (:Person {name: 'Oliver Stone'})--(n)


RETURN n AS connectedNodes

Result

connectedNodes

(:Movie {title: "Wall Street"})

Rows: 1

Directed relationship patterns


The direction of a relationship in a pattern is indicated by arrows: --> or <--.

Find all nodes connected to Oliver Stone by an outgoing relationship.

MATCH (:Person {name: 'Oliver Stone'})-->(movie:Movie)


RETURN movie.title AS movieTitle

Result

movieTitle

"Wall Street"

Rows: 1

Relationship variables
It is possible to introduce a variable to a pattern, either for filtering on relationship properties or to return a
relationship.

Find the types of an aliased relationship

MATCH (:Person {name: 'Oliver Stone'})-[r]->()


RETURN type(r) AS relType

 The above query uses the type() function.

Result

relType

"DIRECTED"

Rows: 1

MATCH on an undirected relationship


When a pattern contains a bound relationship, and that relationship pattern does not specify direction,
Cypher will match the relationship in both directions.

Relationship pattern without direction

MATCH (a)-[:ACTED_IN {role: 'Bud Fox'}]-(b)


RETURN a, b

Result

a b

(:Movie {"title":"Wall Street"}) (:Person {"name":"Charlie Sheen"})

(:Person {"name":"Charlie Sheen"}) (:Movie {"title":"Wall Street"})

Rows: 2

Filter on relationship types


It is possible to specify the type of a relationship in a relationship pattern by using a colon (:) before the
relationship type.

Relationship pattern filtering on the ACTED_IN relationship type

MATCH (:Movie {title: 'Wall Street'})<-[:ACTED_IN]-(actor:Person)


RETURN actor.name AS actor

Result

actor

"Michael Douglas"

"Martin Sheen"

"Charlie Sheen"

Rows: 3

MATCH using relationship type expressions


It is possible to match a pattern containing one of several relationship types using the OR symbol, |.

Relationship pattern including either ACTED_IN or DIRECTED relationship types

MATCH (:Movie {title: 'Wall Street'})<-[:ACTED_IN|DIRECTED]-(person:Person)


RETURN person.name AS person

Result

person

"Oliver Stone"

"Michael Douglas"

"Martin Sheen"

"Charlie Sheen"

Rows: 4

As relationships can only have exactly one type each, ()-[:A&B]->() will never match a relationship.

For a list of all relationship type expressions supported by Cypher, see Patterns → Label expressions.

Find multiple relationships


A graph pattern can contain several relationship patterns.

Graph pattern including several relationship patterns

MATCH (:Person {name: 'Charlie Sheen'})-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(director:Person)


RETURN movie.title AS movieTitle, director.name AS director

Result

movieTitle director

"Wall Street" "Oliver Stone"

Rows: 1

MATCH with WHERE predicates


The MATCH clause is often paired with a WHERE sub-clause, which adds predicates to refine the patterns,
making them more specific. These predicates are part of the pattern itself, not just filters applied after
matching. Thus, always place the WHERE clause with its corresponding MATCH clause.

Simple WHERE predicate

MATCH (charlie:Person)-[:ACTED_IN]->(movie:Movie)
WHERE charlie.name = 'Charlie Sheen'
RETURN movie.title AS movieTitle

Result

movieTitle

"Wall Street"

Rows: 1

More complex WHERE predicate

MATCH (martin:Person)-[:ACTED_IN]->(movie:Movie)
WHERE martin.name = 'Martin Sheen' AND NOT EXISTS {
MATCH (movie)<-[:DIRECTED]-(director:Person {name: 'Oliver Stone'})
}
RETURN movie.title AS movieTitle

 The above query uses an EXISTS subquery.

Result

movieTitle

"The American President"

Rows: 1

For more information, see the WHERE page.

MATCH with parameters


The MATCH clause can be used with parameters.

Parameters

{
"movieTitle": "Wall Street",
"actorRole": "Fox"
}

Find nodes using parameters

MATCH (:Movie {title: $movieTitle})<-[r:ACTED_IN]-(p:Person)


WHERE r.role CONTAINS $actorRole
RETURN p.name AS actor, r.role AS role

 The above query uses the CONTAINS operator.

Result

actor role

"Charlie Sheen" "Bud Fox"

"Martin Sheen" "Carl Fox"

Rows: 2

For more information about how to set parameters, see Syntax → Parameters.

Find paths
The MATCH clause can also be used to bind whole paths to variables.

Find all paths matching a pattern

MATCH path = ()-[:ACTED_IN]->(movie:Movie)


RETURN path

Result

path

(:Person {name: "Charlie Sheen"})-[:ACTED_IN {role: "Bud Fox"}]→(:Movie {title: "Wall Street"})

(:Person {name: "Martin Sheen"})-[:ACTED_IN {role: "Carl Fox"}]→(:Movie {title: "Wall Street"})

(:Person {name: "Martin Sheen"})-[:ACTED_IN {role: "A.J. MacInerney"}]→(:Movie {title: "The American
President"})

(:Person {name: "Michael Douglas"})-[:ACTED_IN {role: "Gordon Gekko"}]→(:Movie {title: "Wall Street"})

(:Person {name: "Michael Douglas"})-[:ACTED_IN {role: "President Andrew Shepherd"}]→(:Movie {title: "The
American President"})

Rows: 5

Find paths matching a pattern including a WHERE predicate

MATCH path = (:Person)-[:ACTED_IN]->(movie:Movie)<-[:DIRECTED]-(:Person)


WHERE movie.title = 'Wall Street'
RETURN path

Result

path

(:Person {name: "Charlie Sheen"})-[:ACTED_IN {role: "Bud Fox"}]→(:Movie {title: "Wall Street"})←[:DIRECTED]-
(:Person {name: "Oliver Stone"})

(:Person {name: "Martin Sheen"})-[:ACTED_IN {role: "Carl Fox"}]→(:Movie {title: "Wall Street"})←[:DIRECTED]-
(:Person {name: "Oliver Stone"})

(:Person {name: "Michael Douglas"})-[:ACTED_IN {role: "Gordon Gekko"}]→(:Movie {title: "Wall


Street"})←[:DIRECTED]-(:Person {name: "Oliver Stone"})

Rows: 3

For more information about how MATCH is used to find patterns of varying complexity (including quantified
path patterns, quantified relationships, and the shortest paths between nodes), see the section on
Patterns.

Multiple MATCH clauses, the WITH clause, and clause composition


In Cypher, the behavior of a query is defined by its clauses. Each clause takes the current graph state and a
table of intermediate results, processes them, and passes the updated graph state and results to the next
clause. The first clause starts with the graph’s initial state and an empty table, while the final clause
produces the query result.

Chaining consecutive MATCH clauses

MATCH (:Person {name: 'Martin Sheen'})-[:ACTED_IN]->(movie:Movie) ①


MATCH (director:Person)-[:DIRECTED]->(movie) ②
RETURN director.name AS director, movie.title AS movieTitle

① The result of the first MATCH clause is the variable movie which holds all the Movies that Martin Sheen
has ACTED_IN.

② The second MATCH clause uses the movie variable to find any Person node with a DIRECTED relationship
to those Movie nodes that Martin Sheen has ACTED_IN.

Result

director movieTitle

"Oliver Stone" "Wall Street"

"Rob Reiner" "The American President"

Rows: 2

A variable can be implicitly carried over to the following clause by being referenced in another operation. A
variable can also be explicitly passed to the following clause using the WITH clause. If a variable is neither
implicitly nor explicitly carried over to its following clause, it will be discarded and is not available for
reference later in the query.

Using WITH and multiple MATCH clauses

MATCH (actors:Person)-[:ACTED_IN]->(movies:Movie) ①
WITH actors, count(movies) AS movieCount ②
ORDER BY movieCount DESC
LIMIT 1 ③
MATCH (actors)-[:ACTED_IN]->(movies) ④
RETURN actors.name AS actor, movieCount, collect(movies.title) AS movies

① The Person and Movie nodes matched in this step are stored in variables, which are then passed on to
the second row of the query.

② The movies variable is implicitly imported by its occurrence in the count() function. The WITH clause
explicitly imports the actors variable.

③ An ORDER BY clause orders the results by movieCount in descending order, ensuring that the Person with
the highest number of movies appears at the top, and LIMIT 1 ensures that all other Person nodes are
discarded.

④ The second MATCH clause finds all Movie nodes associated with the Person nodes currently bound to the
actors variable.

 The above query uses the collect() function.

Result

actor movieCount movies

"Martin Sheen" 2 ["Wall Street", "The American


President"]

Rows: 1

For more information about how Cypher queries work, see Clause composition.

MERGE

Introduction
The MERGE clause either matches existing node patterns in the graph and binds them or, if not present,
creates new data and binds that. In this way, it acts as a combination of MATCH and CREATE that allows for
specific actions depending on whether the specified data was matched or created.

For example, MERGE can be used to specify that a graph must contain a node with a Person label and a
specific name property. If there isn’t a node with the specific name property, a new node will be created with
that name property.

For performance reasons, creating a schema index on the label or property is highly recommended when
using MERGE. See Create, show, and delete indexes for more information.
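
As a minimal sketch (the index name Person_name is arbitrary and chosen only for illustration), such an
index on the name property used throughout this page could be created with:

CREATE INDEX Person_name IF NOT EXISTS
FOR (p:Person) ON (p.name)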

When using MERGE on full patterns, the behavior is that either the whole pattern matches, or the whole
pattern is created. MERGE will not partially use existing patterns. If partial matches are needed, this can be
accomplished by splitting a pattern into multiple MERGE clauses.
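
For example, rather than merging a whole (:Person)-[:ACTED_IN]->(:Movie) pattern in a single clause, the
nodes can be merged first and the relationship last, so that existing nodes are reused where possible (a
sketch based on the example graph below):

MERGE (charlie:Person {name: 'Charlie Sheen'})
MERGE (wallStreet:Movie {title: 'Wall Street'})
MERGE (charlie)-[:ACTED_IN]->(wallStreet)
RETURN charlie.name, wallStreet.title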

Under concurrent updates, MERGE only guarantees the existence of the MERGE pattern, but
not uniqueness. To guarantee uniqueness of nodes with certain properties, a property
 uniqueness constraint should be used. See Using property uniqueness constraints with
MERGE.

Similar to MATCH, MERGE can match multiple occurrences of a pattern. If there are multiple matches, they will
all be passed on to later stages of the query.

The last part of a MERGE clause is the ON CREATE and/or ON MATCH operators. These allow a query to express
additional changes to the properties of a node or relationship, depending on whether the element was
matched (MATCH) in the database or if it was created (CREATE).

Example graph
The following graph is used for the examples below:

[Example graph: Person nodes for Charlie Sheen, Martin Sheen, Michael Douglas, Oliver Stone, and Rob Reiner
(each with name, bornIn, and chauffeurName properties), connected to the Movie nodes Wall Street and
The American President by ACTED_IN and DIRECTED relationships.]

To recreate the graph, run the following query in an empty Neo4j database:

CREATE
(charlie:Person {name: 'Charlie Sheen', bornIn: 'New York', chauffeurName: 'John Brown'}),
(martin:Person {name: 'Martin Sheen', bornIn: 'Ohio', chauffeurName: 'Bob Brown'}),
(michael:Person {name: 'Michael Douglas', bornIn: 'New Jersey', chauffeurName: 'John Brown'}),
(oliver:Person {name: 'Oliver Stone', bornIn: 'New York', chauffeurName: 'Bill White'}),
(rob:Person {name: 'Rob Reiner', bornIn: 'New York', chauffeurName: 'Ted Green'}),
(wallStreet:Movie {title: 'Wall Street'}),
(theAmericanPresident:Movie {title: 'The American President'}),
(charlie)-[:ACTED_IN]->(wallStreet),
(martin)-[:ACTED_IN]->(wallStreet),
(michael)-[:ACTED_IN]->(wallStreet),
(martin)-[:ACTED_IN]->(theAmericanPresident),
(michael)-[:ACTED_IN]->(theAmericanPresident),
(oliver)-[:DIRECTED]->(wallStreet),
(rob)-[:DIRECTED]->(theAmericanPresident)

Merge nodes

Merge single node with a label


Merge a node with a specific label:

Query

MERGE (robert:Critic)
RETURN labels(robert)

A new node is created because there are no nodes labeled Critic in the database:

Result

labels(robert)

["Critic"]

Merge single node with multiple labels
Multiple labels are separated by colons:

Query

MERGE (robert:Critic:Viewer)
RETURN labels(robert)

A new node is created because there are no nodes labeled both Critic and Viewer in the database:

Result

labels(robert)

["Critic","Viewer"]

As of Neo4j 5.18, multiple labels can also be separated by an ampersand &, in the same manner as it is
used in label expressions. Separation by colon : and ampersand & cannot be mixed in the same clause.

Query

MERGE (robert:Critic&Viewer)
RETURN labels(robert)

No new node is created because there was already a node labeled both Critic and Viewer in the
database:

Result

labels(robert)

["Critic","Viewer"]

Merge single node with properties


Merging a node with properties that differ from the properties on existing nodes in the graph will create a
new node:

Query

MERGE (charlie {name: 'Charlie Sheen', age: 10})


RETURN charlie

A new node with the name Charlie Sheen is created since not all properties matched those set to the pre-
existing Charlie Sheen node:

Result

charlie

(:Person {"name":"Charlie Sheen", "age":10})

MERGE cannot be used for nodes with property values that are null. For example, the
following query will throw an error:

Query

 MERGE (martin:Person {name: 'Martin Sheen', age: null})


RETURN martin

Cannot merge the following node because of null property value for 'age': (:Person {age:
null})

Merge single node specifying both label and property


Merging a single node with both label and property matching an existing node will not create a new node:

Query

MERGE (michael:Person {name: 'Michael Douglas'})


RETURN michael.name, michael.bornIn

Michael Douglas is matched and the name and bornIn properties are returned:

Result

michael.name michael.bornIn

"Michael Douglas" "New Jersey"

Merge single node derived from an existing node property


It is possible to merge nodes using existing node properties:

Query

MATCH (person:Person)
MERGE (location:Location {name: person.bornIn})
RETURN person.name, person.bornIn, location

In the above query, three nodes labeled Location are created, each of which contains a name property with
the value of New York, Ohio, and New Jersey respectively. Note that even though the MATCH clause results in
three bound nodes having the value New York for the bornIn property, only a single New York node (i.e. a
Location node with a name of New York) is created. As the New York node is not matched for the first
bound node, it is created. However, the newly-created New York node is matched and bound for the
second and third bound nodes.

Result

person.name person.bornIn location

"Charlie Sheen" "New York" {name:"New York"}

"Martin Sheen" "Ohio" {name:"Ohio"}

"Michael Douglas" "New Jersey" {name:"New Jersey"}


"Oliver Stone" "New York" {name:"New York"}

"Rob Reiner" "New York" {name:"New York"}

Use ON CREATE and ON MATCH

Merge with ON CREATE


Merge a node and set properties if the node needs to be created:

Query

MERGE (keanu:Person {name: 'Keanu Reeves', bornIn: 'Beirut', chauffeurName: 'Eric Brown'})
ON CREATE
SET keanu.created = timestamp()
RETURN keanu.name, keanu.created

The query creates the Person node named Keanu Reeves, with a bornIn property set to Beirut and a
chauffeurName property set to Eric Brown. It also sets a timestamp for the created property.

Result

keanu.name keanu.created

"Keanu Reeves" 1655200898563

Merge with ON MATCH


Merging nodes and setting properties on found nodes:

Query

MERGE (person:Person)
ON MATCH
SET person.found = true
RETURN person.name, person.found

The query finds all the Person nodes, sets a property on them, and returns them:

Result

person.name person.found

"Charlie Sheen" true

"Martin Sheen" true

"Michael Douglas" true

"Oliver Stone" true

"Rob Reiner" true

"Keanu Reeves" true

Merge with ON CREATE and ON MATCH
Query

MERGE (keanu:Person {name: 'Keanu Reeves'})


ON CREATE
SET keanu.created = timestamp()
ON MATCH
SET keanu.lastSeen = timestamp()
RETURN keanu.name, keanu.created, keanu.lastSeen

Because the Person node named Keanu Reeves already exists, this query does not create a new node.
Instead, it adds a timestamp on the lastSeen property.

Result

keanu.name keanu.created keanu.lastSeen

"Keanu Reeves" 1655200902354 1674655352124

Merge with ON MATCH setting multiple properties


If multiple properties should be set, separate them with commas:

Query

MERGE (person:Person)
ON MATCH
SET
person.found = true,
person.lastAccessed = timestamp()
RETURN person.name, person.found, person.lastAccessed

Result

person.name person.found person.lastAccessed

"Charlie Sheen" true 1655200903558

"Martin Sheen" true 1655200903558

"Michael Douglas" true 1655200903558

"Oliver Stone" true 1655200903558

"Rob Reiner" true 1655200903558

"Keanu Reeves" true 1655200903558

Merge relationships

Merge on a relationship
MERGE can be used to match or create a relationship:

Query

MATCH
(charlie:Person {name: 'Charlie Sheen'}),
(wallStreet:Movie {title: 'Wall Street'})
MERGE (charlie)-[r:ACTED_IN]->(wallStreet)
RETURN charlie.name, type(r), wallStreet.title

Charlie Sheen had already been marked as acting in Wall Street, so the existing relationship is found and
returned. Note that in order to match or create a relationship when using MERGE, at least one bound node
must be specified, which is done via the MATCH clause in the above example.

Result

charlie.name type(r) wallStreet.title

"Charlie Sheen" "ACTED_IN" "Wall Street"

MERGE cannot be used for relationships with property values that are null. For example,
the following query will throw an error:

Query

MERGE (martin:Person {name: 'Martin Sheen'})-[r:FATHER_OF {since: null}]->(charlie:Person {name: 'Charlie Sheen'})
RETURN type(r)

Cannot merge the following relationship because of null property value for 'since':
(martin)-[:FATHER_OF {since: null}]->(charlie)

As of Neo4j 5.20, specifying a property of an entity (node or relationship) by referring to the property of
another entity in the same MERGE clause is deprecated.

For example, referring to charlie.bornIn in the property definition of oliver.bornIn is deprecated.

Query
 MERGE (charlie:Person {name: 'Charlie Sheen', bornIn: 'New York'})-[:ACTED_IN]->(
movie:Movie)<-[:DIRECTED]-(oliver:Person {name: 'Oliver Stone', bornIn: charlie.bornIn})
RETURN movie

Merging an entity (charlie) and referencing that entity in a property definition in the
same MERGE is deprecated.

Merge on multiple relationships


Query

MATCH
(oliver:Person {name: 'Oliver Stone'}),
(reiner:Person {name: 'Rob Reiner'})
MERGE (oliver)-[:DIRECTED]->(movie:Movie)<-[:DIRECTED]-(reiner)
RETURN movie

In the example graph, Oliver Stone and Rob Reiner have never worked together. When trying to MERGE a
Movie node between them, Neo4j will not use any of the existing Movie nodes already connected to either
person. Instead, a new Movie node is created.

Result

movie

(:Movie)

Merge on an undirected relationship


MERGE can also be used without specifying the direction of a relationship. Cypher will first try to match the
relationship in both directions. If the relationship does not exist in either direction, it will create one left to
right.

Query

MATCH
(charlie:Person {name: 'Charlie Sheen'}),
(oliver:Person {name: 'Oliver Stone'})
MERGE (charlie)-[r:KNOWS]-(oliver)
RETURN r

As Charlie Sheen and Oliver Stone do not know each other in the example graph, this MERGE query will
create a KNOWS relationship between them. The direction of the created relationship is left to right.

Result

[:KNOWS]

Merge on a relationship between two existing nodes


MERGE can be used in conjunction with preceding MATCH and MERGE clauses to create a relationship between
two bound nodes m and n, where m is returned by MATCH and n is created or matched by the earlier MERGE.

Query

MATCH (person:Person)
MERGE (location:Location {name: person.bornIn})
MERGE (person)-[r:BORN_IN]->(location)
RETURN person.name, person.bornIn, location

This builds on the example from Merge single node derived from an existing node property. The second
MERGE creates a BORN_IN relationship between each person and a location corresponding to the value of the
person’s bornIn property. Charlie Sheen, Rob Reiner, and Oliver Stone all have a BORN_IN relationship to
the same Location node (New York).

Result

person.name person.bornIn location

"Charlie Sheen" "New York" (:Location {name:"New York"})


"Martin Sheen" "Ohio" (:Location {name:"Ohio"})

"Michael Douglas" "New Jersey" (:Location {name:"New Jersey"})

"Oliver Stone" "New York" (:Location {name:"New York"})

"Rob Reiner" "New York" (:Location {name:"New York"})

"Keanu Reeves" "Beirut" (:Location {name:"Beirut"})

Merge on a relationship between an existing node and a merged node derived from a node property
MERGE can be used to simultaneously create both a new node n and a relationship between a bound node m
and n:

Query

MATCH (person:Person)
MERGE (person)-[r:HAS_CHAUFFEUR]->(chauffeur:Chauffeur {name: person.chauffeurName})
RETURN person.name, person.chauffeurName, chauffeur

As MERGE found no matches — in the example graph, there are no nodes labeled with Chauffeur and no
HAS_CHAUFFEUR relationships — MERGE creates six nodes labeled with Chauffeur, each of which contains a
name property whose value corresponds to each matched Person node’s chauffeurName property value.
MERGE also creates a HAS_CHAUFFEUR relationship between each Person node and the newly-created
corresponding Chauffeur node. As 'Charlie Sheen' and 'Michael Douglas' both have a chauffeur with
the same name — 'John Brown' — a new node is created in each case, resulting in two Chauffeur nodes
having a name of 'John Brown', correctly denoting the fact that even though the name property may be
identical, these are two separate people. This is in contrast to the example shown above in Merge on a
relationship between two existing nodes, where the first MERGE was used to bind the Location nodes and
to prevent them from being recreated (and thus duplicated) on the second MERGE.

Result

person.name person.chauffeurName chauffeur

"Charlie Sheen" "John Brown" (:Person {name:"John Brown"})

"Martin Sheen" "Bob Brown" (:Person {name:"Bob Brown"})

"Michael Douglas" "John Brown" (:Person {name:"John Brown"})

"Oliver Stone" "Bill White" (:Person {name:"Bill White"})

"Rob Reiner" "Ted Green" (:Person {name:"Ted Green"})

"Keanu Reeves" "Eric Brown" (:Person {name:"Eric Brown"})

Using node property uniqueness constraints with MERGE


Cypher prevents getting conflicting results from MERGE when using patterns that involve property
uniqueness constraints. In this case, there must be at most one node that matches that pattern.

For example, given two node property uniqueness constraints on :Person(id) and :Person(ssn), a query
such as MERGE (n:Person {id: 12, ssn: 437}) will fail if there are two different nodes (one with id 12
and one with ssn 437), or if there is only one node with only one of the properties. In other words, there
must be exactly one node that matches the pattern, or no matching nodes.

Note that the following examples assume the existence of property uniqueness constraints that have been
created using:

CREATE CONSTRAINT FOR (n:Person) REQUIRE n.name IS UNIQUE;


CREATE CONSTRAINT FOR (n:Person) REQUIRE n.role IS UNIQUE;

Merge node using property uniqueness constraints creates a new node if no node is found
Given the node property uniqueness constraint on the name property for all Person nodes, the below query
will create a new Person with the name property Laurence Fishburne. If a Laurence Fishburne node had
already existed, MERGE would match the existing node instead.

Query

MERGE (laurence:Person {name: 'Laurence Fishburne'})


RETURN laurence.name

Result

laurence.name

"Laurence Fishburne"

Merge using node property uniqueness constraints matches an existing node


Given property uniqueness constraint on the name property for all Person nodes, the below query will
match the pre-existing Person node with the name property Oliver Stone.

Query

MERGE (oliver:Person {name: 'Oliver Stone'})


RETURN oliver.name, oliver.bornIn

Result

oliver.name oliver.bornIn

"Oliver Stone" "New York"

Merge with property uniqueness constraints and partial matches


Merge using property uniqueness constraints fails when finding partial matches:

Query

MERGE (michael:Person {name: 'Michael Douglas', role: 'Gordon Gekko'})


RETURN michael

While there is a matching unique Person node with the name Michael Douglas, there is no unique node
with the role of Gordon Gekko and MERGE, therefore, fails to match.

Error message

Node already exists with label `Person` and property `name` = 'Michael Douglas'

To set the role of Gordon Gekko to Michael Douglas, use the SET clause instead:

Query

MERGE (michael:Person {name: 'Michael Douglas'})


SET michael.role = 'Gordon Gekko'

Result

Set 1 property

Merge with property uniqueness constraints and conflicting matches


Merge using property uniqueness constraints fails when finding conflicting matches:

Query

MERGE (oliver:Person {name: 'Oliver Stone', role: 'Gordon Gekko'})


RETURN oliver

While there is a matching unique Person node with the name Oliver Stone, there is also another unique
Person node with the role of Gordon Gekko and MERGE fails to match.

Error message

Node already exists with label `Person` and property `name` = 'Oliver Stone'

Using relationship property uniqueness constraints with MERGE


All that has been said above about node uniqueness constraints also applies to relationship uniqueness
constraints. However, for relationship uniqueness constraints there are some additional things to consider.

For example, if there exists a relationship uniqueness constraint on ()-[:ACTED_IN(year)]-(), then the
following query, in which not all nodes of the pattern are bound, would fail:

Query

MERGE (charlie:Person {name: 'Charlie Sheen'})-[r:ACTED_IN {year: 1987}]->(wallStreet:Movie {title: 'Wall Street'})
RETURN charlie.name, type(r), wallStreet.title

This is due to the all-or-nothing semantics of MERGE, which causes the query to fail if there exists a
relationship with the given year property but there is no match for the full pattern. In this example, since no
match was found for the pattern, MERGE will try to create the full pattern including a relationship with
{year: 1987}, which will lead to a constraint violation error.

Therefore, especially when relationship uniqueness constraints exist, it is advised to always use bound
nodes in the MERGE pattern. The following is a more appropriate composition of the query:

Query

MATCH
(charlie:Person {name: 'Charlie Sheen'}),
(wallStreet:Movie {title: 'Wall Street'})
MERGE (charlie)-[r:ACTED_IN {year: 1987}]->(wallStreet)
RETURN charlie.name, type(r), wallStreet.title

Using map parameters with MERGE


MERGE does not support map parameters the same way that CREATE does. To use map parameters with
MERGE, it is necessary to explicitly use the expected properties, such as in the following example. For more
information on parameters, see Parameters.
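For comparison, CREATE accepts a whole map parameter directly, which MERGE does not. A minimal sketch ($param refers to the parameter defined below):

// Works: the node receives all properties from the map.
CREATE (person:Person $param)
RETURN person.name

// Does not work: a map parameter is not a valid property pattern for MERGE.
// MERGE (person:Person $param)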

Parameters

{
"param": {
"name": "Keanu Reeves",
"bornIn": "Beirut",
"chauffeurName": "Eric Brown"
}
}

Query

MERGE (person:Person {name: $param.name, bornIn: $param.bornIn, chauffeurName: $param.chauffeurName})


RETURN person.name, person.bornIn, person.chauffeurName

Result

person.name person.bornIn person.chauffeurName

"Keanu Reeves" "Beirut" "Eric Brown"

OPTIONAL MATCH

Introduction
OPTIONAL MATCH matches patterns against a graph database, just as MATCH does. The difference is that if no
matches are found, OPTIONAL MATCH will use a null for missing parts of the pattern. OPTIONAL MATCH could
therefore be considered the Cypher equivalent of the outer join in SQL.

When using OPTIONAL MATCH, either the whole pattern is matched, or nothing is matched. The WHERE clause
is part of the pattern description, and its predicates will be considered while looking for matches, not after.
This matters especially in the case of multiple (OPTIONAL) MATCH clauses, where it is crucial to put WHERE
together with the MATCH it belongs to.

 To understand the patterns used in the OPTIONAL MATCH clause, read Patterns.

Example graph
The following graph is used for the examples below:

[Diagram: Charlie Sheen, Martin Sheen, and Michael Douglas ACTED_IN the Movie Wall Street, which Oliver Stone DIRECTED; Martin Sheen and Michael Douglas also ACTED_IN The American President, which Rob Reiner DIRECTED; Martin Sheen is FATHER_OF Charlie Sheen. All people are Person nodes.]

To recreate the graph, run the following query in an empty Neo4j database:

CREATE
(charlie:Person {name: 'Charlie Sheen'}),
(martin:Person {name: 'Martin Sheen'}),
(michael:Person {name: 'Michael Douglas'}),
(oliver:Person {name: 'Oliver Stone'}),
(rob:Person {name: 'Rob Reiner'}),
(wallStreet:Movie {title: 'Wall Street'}),
(charlie)-[:ACTED_IN]->(wallStreet),
(martin)-[:ACTED_IN]->(wallStreet),
(michael)-[:ACTED_IN]->(wallStreet),
(oliver)-[:DIRECTED]->(wallStreet),
(thePresident:Movie {title: 'The American President'}),
(martin)-[:ACTED_IN]->(thePresident),
(michael)-[:ACTED_IN]->(thePresident),
(rob)-[:DIRECTED]->(thePresident),
(martin)-[:FATHER_OF]->(charlie)

OPTIONAL MATCH in more detail


Like SQL, Cypher queries are constructed using various clauses which are chained together to feed
intermediate results between each other. For example, the matching variables from one MATCH clause will
provide the context in which the next clause exists. However, there are two important differences between
Neo4j and SQL which help to explain OPTIONAL MATCH further.

1. While it is both possible and advised to enforce partial schemas using indexes and constraints, Neo4j
offers a greater degree of schema flexibility than a relational database. Nodes and relationships in a
Neo4j database are not required to have a specific property just because other nodes or relationships
in the same graph have that property (unless a property existence constraint has been created for that
property).

2. Queries in Cypher are run as pipelines. If a clause returns no results, it will effectively end the query as
subsequent clauses will have no data to execute upon.

For example, the following query returns no results:

MATCH (a:Person {name: 'Martin Sheen'})


MATCH (a)-[r:DIRECTED]->()
RETURN a.name, r

(no changes, no records)

This is because the second MATCH clause returns no data (there are no DIRECTED relationships connected to
Martin Sheen in the graph) to pass on to the RETURN clause.

However, replacing the second MATCH clause with OPTIONAL MATCH does return results. This is because,
unlike MATCH, OPTIONAL MATCH enables the value null to be passed between clauses.

MATCH (p:Person {name: 'Martin Sheen'})


OPTIONAL MATCH (p)-[r:DIRECTED]->()
RETURN p.name, r

Result

p.name r

"Martin Sheen" <null>

Rows: 1

OPTIONAL MATCH can therefore be used to check graphs for missing as well as existing values, and to pass
on rows without any data to subsequent clauses in a query.

Optional relationships
If the existence of a relationship is optional, use the OPTIONAL MATCH clause. If the relationship exists, it is
returned. If it does not, null is returned in its place.

MATCH (a:Movie {title: 'Wall Street'})


OPTIONAL MATCH (a)-->(x)
RETURN x

Returns null, since the Movie node Wall Street has no outgoing relationships.

Result

<null>

Rows: 1

On the other hand, the following query does not return null since the Person node Charlie Sheen has one
outgoing relationship.

MATCH (a:Person {name: 'Charlie Sheen'})


OPTIONAL MATCH (a)-->(x)
RETURN x

Result

{"title":"Wall Street"}

Rows: 1

Properties on optional elements


If the existence of a property is optional, use the OPTIONAL MATCH clause. null will be returned if the
specified property does not exist.

MATCH (a:Movie {title: 'Wall Street'})


OPTIONAL MATCH (a)-->(x)
RETURN x, x.name

Returns the element x (null in this query), and null for its name property, because the Movie node Wall
Street has no outgoing relationships.

Result

x x.name

<null> <null>

Rows: 1

The following query only returns null for the nodes which lack a name property.

MATCH (a:Person {name: 'Martin Sheen'})


OPTIONAL MATCH (a)-->(x)
RETURN x, x.name

Result

x x.name

{"title":"Wall Street"} <null>

{"name":"Charlie Sheen"} "Charlie Sheen"

{"title":"The American President"} <null>

Rows: 3

Optional typed and named relationship


It is also possible to look for specific relationship types when using OPTIONAL MATCH:

MATCH (a:Movie {title: 'Wall Street'})


OPTIONAL MATCH (a)-[r:ACTED_IN]->()
RETURN a.title, r

This returns the title of the Movie node Wall Street, and since this node has no outgoing ACTED_IN
relationships, null is returned for the relationship denoted by the variable r.

Result

a.title r

"Wall Street" <null>

Rows: 1

However, the following query does not return null since it is looking for incoming relationships of the type
ACTED_IN to the Movie node Wall Street.

MATCH (a:Movie {title: 'Wall Street'})


OPTIONAL MATCH (x)-[r:ACTED_IN]->(a)
RETURN a.title, x.name, type(r)

Result

a.title x.name type(r)

"Wall Street" "Michael Douglas" "ACTED_IN"


"Wall Street" "Martin Sheen" "ACTED_IN"

"Wall Street" "Charlie Sheen" "ACTED_IN"

Rows: 3

ORDER BY
ORDER BY specifies how the output of a clause should be sorted. It can be used as a sub-clause following
RETURN or WITH. As of Neo4j 5.24, it can also be used as a standalone clause, either on its own or in
combination with SKIP/OFFSET or LIMIT.

ORDER BY relies on comparisons to sort the output, see Ordering and comparison of values. You can sort on
many different values, e.g. node/relationship properties, the node/relationship ids, or on most expressions.

 Unless ORDER BY is used, Neo4j does not guarantee the row order of a query result.

Example graph
The following graph is used for the examples below:

[Diagram: Andy (age: 34, length: 170) KNOWS Bernard (age: 36), who KNOWS Charlotte (age: 32, length: 185). All are Person nodes.]

To recreate it, run the following query against an empty Neo4j database:

CREATE
(andy: Person {name: 'Andy', age: 34, length: 170}),
(bernard: Person {name: 'Bernard', age: 36}),
(charlotte: Person {name: 'Charlotte', age: 32, length: 185}),
(andy)-[:KNOWS]->(bernard),
(bernard)-[:KNOWS]->(charlotte)

Order nodes by property


ORDER BY is used to sort the output.

Query

MATCH (n)
RETURN n.name, n.age
ORDER BY n.name

The nodes are returned, sorted by their name.

Result

n.name n.age

"Andy" 34

"Bernard" 36

"Charlotte" 32

Rows: 3

Order nodes by multiple properties


You can order by multiple properties by stating each variable in the ORDER BY clause. Cypher will sort the
result by the first variable listed and, for equal values, go on to the next property in the ORDER BY clause, and
so on.

Query

MATCH (n)
RETURN n.name, n.age
ORDER BY n.age, n.name

This returns the nodes, sorted first by their age, and then by their name.

Result

n.name n.age

"Charlotte" 32

"Andy" 34

"Bernard" 36

Rows: 3

Order nodes by ID
ORDER BY is used to sort the output.

Query

MATCH (n)
RETURN n.name, n.age
ORDER BY elementId(n)

The nodes are returned, sorted by their internal ID.

Result

n.name n.age

"Andy" 34

"Bernard" 36

"Charlotte" 32


Rows: 3

 Neo4j reuses its internal IDs when nodes and relationships are deleted. Applications relying on internal Neo4j IDs are, as a result, brittle and can be inaccurate. It is recommended to use application-generated IDs instead.

Order nodes by expression


ORDER BY is used to sort the output.

Query

MATCH (n)
RETURN n.name, n.age, n.length
ORDER BY keys(n)

The nodes are returned, sorted by their property keys.

Result

n.name n.age n.length

"Bernard" 36 <null>

"Andy" 34 170

"Charlotte" 32 185

Rows: 3

Order nodes in descending order


By adding DESC[ENDING] after the variable to sort on, the sort will be done in reverse order.

Query

MATCH (n)
RETURN n.name, n.age
ORDER BY n.name DESC

The example returns the nodes, sorted by their name in reverse order.

Result

n.name n.age

"Charlotte" 32

"Bernard" 36

"Andy" 34

Rows: 3

Ordering null
When sorting the result set, null will always come at the end of the result set for ascending sorting, and
first when doing descending sort.

Query

MATCH (n)
RETURN n.length, n.name, n.age
ORDER BY n.length

The nodes are returned sorted by the length property, with a node without that property last.

Result

n.length n.name n.age

170 "Andy" 34

185 "Charlotte" 32

<null> "Bernard" 36

Rows: 3

Ordering in a WITH clause


When ORDER BY is present on a WITH clause, the immediately following clause will receive records in the
specified order. The order is not guaranteed to be retained after the following clause, unless that clause also
has an ORDER BY sub-clause. This ordering guarantee can be exploited by operations which depend on
the order in which they consume values. For example, it can be used to control the order of items in the
list produced by the collect() aggregating function. The MERGE and SET clauses also have ordering
dependencies which can be controlled this way.

Query

MATCH (n)
WITH n ORDER BY n.age
RETURN collect(n.name) AS names

The list of names built from the collect aggregating function contains the names in order of the age
property.

Result

names

["Charlotte","Andy","Bernard"]

Rows: 1

Ordering aggregated or DISTINCT results


In terms of scope of variables, ORDER BY follows special rules, depending on whether the projecting RETURN or WITH
clause is aggregating or DISTINCT. If it is an aggregating or DISTINCT projection, only the variables
available in the projection are available. If the projection does not alter the output cardinality (which
aggregation and DISTINCT do), variables available from before the projecting clause are also available.
When the projection clause shadows already existing variables, only the new variables are available.

It is also not allowed to use aggregating expressions in the ORDER BY sub-clause if they are not also listed
in the projecting clause. This rule ensures that ORDER BY changes only the order of the results, not the results
themselves.
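For example, against the graph above, the first of the following queries is allowed because the aggregating expression is also projected, while the second is rejected (a minimal sketch):

MATCH (n)-[:KNOWS]->(friend)
RETURN n.name, count(friend) AS friends
ORDER BY friends            // allowed: the aggregation is part of the projection

MATCH (n)-[:KNOWS]->(friend)
RETURN n.name
ORDER BY count(friend)      // not allowed: the aggregating expression is not projected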

ORDER BY and indexes


The performance of Cypher queries using ORDER BY on node properties can be influenced by the existence
and use of an index for finding the nodes. If the index can provide the nodes in the order requested in the
query, Cypher can avoid the use of an expensive Sort operation. Read more about this capability in Range
index-backed ORDER BY.
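As an illustration (the index is not part of the example setup above), a range index on the sorted property can allow the planner to produce rows already in the requested order; whether the Sort operator is actually avoided can be checked with PROFILE:

CREATE INDEX person_name_range_index FOR (n:Person) ON (n.name);

MATCH (n:Person)
WHERE n.name IS NOT NULL
RETURN n.name
ORDER BY n.name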

Using ORDER BY as a standalone clause (new in Neo4j 5.24)


ORDER BY can be used as a standalone clause, or in conjunction with SKIP/OFFSET or LIMIT.

Standalone use of ORDER BY

MATCH (n)
ORDER BY n.name
RETURN collect(n.name) AS names

Result

names

["Andy", "Bernard", "Charlotte"]

Rows: 1

The following query orders all nodes by name descending, skips the first row and limits the results to one
row.

ORDER BY used in conjunction with SKIP and LIMIT

MATCH (n)
ORDER BY n.name DESC
SKIP 1
LIMIT 1
RETURN n.name AS name

Result

name

"Bernard"

Rows: 1

REMOVE
The REMOVE clause is used to remove properties from nodes and relationships, and to remove labels from
nodes.

 For deleting nodes and relationships, see DELETE.

 Removing labels from a node is an idempotent operation: if you try to remove a label from a node that does not have that label on it, nothing happens. The query statistics will tell you if something needed to be done or not.

Example graph
The following graph is used for the examples below:

[Diagram: Andy (Swedish, age: 36, propTestValue1: 42) KNOWS Timothy (Swedish, age: 25, propTestValue2: 42) and Peter (German and Swedish, age: 34).]

To recreate it, run the following query against an empty Neo4j database:

CREATE
(a:Swedish {name: 'Andy', age: 36, propTestValue1: 42}),
(t:Swedish {name: 'Timothy', age: 25, propTestValue2: 42}),
(p:German:Swedish {name: 'Peter', age: 34}),
(a)-[:KNOWS]->(t),
(a)-[:KNOWS]->(p)

Remove a property
Neo4j doesn’t allow storing null in properties. Instead, if no value exists, the property is just not there. So,
REMOVE is used to remove a property value from a node or a relationship.

Query

MATCH (a {name: 'Andy'})


REMOVE a.age
RETURN a.name, a.age

The node is returned, and no property age exists on it.

Result

a.name a.age

"Andy" <null>

Rows: 1
Properties set: 1

Remove all properties


REMOVE cannot be used to remove all existing properties from a node or relationship. Instead, using SET
with = and an empty map as the right operand will clear all properties from the node or relationship.
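A minimal sketch of that alternative, using the Timothy node from the example graph (the SET clause is documented in detail later in this chapter):

MATCH (n {name: 'Timothy'})
SET n = {}
RETURN n.name, keys(n)

After this, n.name is null and keys(n) is an empty list.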

Dynamically removing a property (new in Neo4j 5.24)


REMOVE can be used to remove a property on a node or relationship even when the property key name is
not statically known.

REMOVE n[key]

The dynamically calculated key must evaluate to a STRING value. The following query removes every
property whose key contains the substring "Test":

Query

MATCH (n)
WITH n, [k IN keys(n) WHERE k CONTAINS "Test" | k] as propertyKeys ①
FOREACH (i IN propertyKeys | REMOVE n[i]) ②
RETURN n.name, keys(n);

① The keys() function retrieves all property keys of the matched nodes, and a list comprehension filters
these keys to include only those that contain the substring "Test", assigning the resulting list to the
variable propertyKeys.

② The FOREACH clause iterates over each key in the propertyKeys list and removes the corresponding
property using the REMOVE clause.

All properties with the word "Test" in them are removed:

Result

n.name keys(n)

"Andy" ["name", "age"]

"Timothy" ["name", "age"]


"Peter" ["name", "age"]

Rows: 3
Properties set: 2

Remove a label from a node


To remove labels, you use REMOVE.

Query

MATCH (n {name: 'Peter'})


REMOVE n:German
RETURN n.name, labels(n)

Result

n.name labels(n)

"Peter" ["Swedish"]

Rows: 1
Labels removed: 1

Dynamically removing a label (new in Neo4j 5.24)


REMOVE can be used to remove a label on a node even when the label is not statically known.

MATCH (n)
REMOVE n:$(expr)

Query

MATCH (n {name: 'Peter'})


UNWIND labels(n) AS label ①
REMOVE n:$(label)
RETURN n.name, labels(n)

① UNWIND is used here to transform the list of labels from the labels() function into separate rows,
allowing subsequent operations to be performed on each label individually.

Result

n.name labels(n)

"Peter" []

Rows: 1
Labels removed: 2

Remove multiple labels from a node


To remove multiple labels, you use REMOVE.

Query

MATCH (n {name: 'Peter'})


REMOVE n:German:Swedish
RETURN n.name, labels(n)

Result

n.name labels(n)

"Peter" []

Rows: 1
Labels removed: 2

Remove multiple labels dynamically from a node (new in Neo4j 5.24)


It is possible to remove multiple labels dynamically using a LIST<STRING> and/or by chaining them
separately with a colon (:):

Query

MATCH (n {name: 'Peter'})


REMOVE n:$(labels(n))
RETURN n.name, labels(n)

Result

n.name labels(n)

"Peter" []

Rows: 1
Labels removed: 2
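Chaining dynamic expressions works in the same way as chaining static labels. A sketch, assuming the node still carries both labels:

MATCH (n {name: 'Peter'})
REMOVE n:$("German"):$("Swedish")
RETURN n.name, labels(n)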

RETURN

Introduction
The RETURN clause defines the parts of a pattern (nodes, relationships, and/or properties) to be included in
the query result.

Example graph
The following graph is used for the examples below:

To recreate the graph, run the following query against an empty Neo4j database.

CREATE
(keanu:Person {name: 'Keanu Reeves', bornIn: 'Beirut', nationality: 'Canadian'}),
(taiChi:Movie {title: 'Man of Tai Chi', released: 2013}),
(keanu)-[:ACTED_IN]->(taiChi),
(keanu)-[:DIRECTED]->(taiChi)

Return nodes
To return a node, list it in the RETURN clause:

Query

MATCH (p:Person {name: 'Keanu Reeves'})


RETURN p

Result

{"bornIn":"Beirut","nationality":"Canadian","name":"Keanu Reeves"}

Rows: 1

Return relationships
To return a relationship type, list it in the RETURN clause:

Query

MATCH (p:Person {name: 'Keanu Reeves'})-[r:ACTED_IN]->(m)


RETURN type(r)

Result

type(r)

"ACTED_IN"

Rows: 1

Return property
To return a specific property, use the dot separator:

Query

MATCH (p:Person {name: 'Keanu Reeves'})


RETURN p.bornIn

Result

p.bornIn

"Beirut"

Rows: 1

 To only return the value of a property, do not return the full node/relationship. This will improve performance.

Return all elements


To return all nodes, relationships and paths found in a query, use the * symbol:

Query

MATCH p = (keanu:Person {name: 'Keanu Reeves'})-[r]->(m)


RETURN *

This returns the two nodes, and the two possible paths between them.

Result

keanu m p r

{"bornIn":"Beirut","nation {"title":"Man of Tai (:Person {bornIn: {:ACTED_IN}


ality":"Canadian","name":" Chi","released":2013} "Beirut",nationality:
Keanu Reeves"} "Canadian",name: "Keanu
Reeves"})-
[:ACTED_IN]→(:Movie
{title: "Man of Tai
Chi",released: 2013})

{"bornIn":"Beirut","nation {"title":"Man of Tai (:Person {bornIn: {:DIRECTED}


ality":"Canadian","name":" Chi","released":2013} "Beirut",nationality:
Keanu Reeves"} "Canadian",name: "Keanu
Reeves"})-
[:DIRECTED]→(:Movie
{title: "Man of Tai
Chi",released: 2013})

Rows: 2

Variable with uncommon characters


To introduce a variable made up of characters not contained in the English alphabet, use ` to enclose the
variable:

Query

MATCH (`/uncommon variable\`)


WHERE `/uncommon variable\`.name = 'Keanu Reeves'
RETURN `/uncommon variable\`.bornIn

The bornIn property of the node with the name property set to 'Keanu Reeves' is returned:

Result

/uncommon variable\.bornIn

"Beirut"

Rows: 1

Column alias
Names of returned columns can be renamed using the AS operator:

Query

MATCH (p:Person {name: 'Keanu Reeves'})


RETURN p.nationality AS citizenship

Returns the nationality property of 'Keanu Reeves', but the column is renamed to citizenship.

Result

citizenship

"Canadian"

Rows: 1

Optional properties
If the existence of a property is unknown, it can still be included in a RETURN clause. It will be treated as
null if it is missing.

Query

MATCH (n)
RETURN n.bornIn

This example returns the bornIn properties for nodes that have that property, and null for those nodes
missing the property.

Result

n.bornIn

"Beirut"

<null>


Rows: 2

Other expressions
Any expression can be used as a return item — literals, predicates, properties, functions, and so on.

Query

MATCH (m:Movie {title: 'Man of Tai Chi'})


RETURN m.released < 2012, "I'm a literal", [p=(m)--() | p] AS `(m)--()`

Returns a predicate, a literal, and a pattern comprehension:

Result

m.released < 2012 "I’m a literal" (m)--()

false "I’m a literal" [(:Movie {title: "Man of Tai


Chi",released: 2013})←[:DIRECTED]-
(:Person {bornIn:
"Beirut",nationality:
"Canadian",name: "Keanu Reeves"}),
(:Movie {title: "Man of Tai
Chi",released: 2013})←[:ACTED_IN]-
(:Person {bornIn:
"Beirut",nationality:
"Canadian",name: "Keanu Reeves"})]

Rows: 1

Unique results
DISTINCT retrieves only unique rows for the columns that have been selected for output.

Query

MATCH (p:Person {name: 'Keanu Reeves'})-->(m)


RETURN DISTINCT m

The Movie node 'Man of Tai Chi' is returned by the query, but only once (without the DISTINCT operator it
would have been returned twice because there are two relationships going to it from 'Keanu Reeves'):

Result

{"title":"Man of Tai Chi","released":2013}+

Rows: 1

SET
The SET clause is used to update labels on nodes and properties on nodes and relationships.

The SET clause can be used with a map — provided as a literal or a parameter — to set properties.

 Setting labels on a node is an idempotent operation: nothing will occur if an attempt is made to set a label on a node that already has that label. The query statistics will state whether any updates actually took place.

Example graph
The following graph is used for the examples below:

[Diagram: Stefan KNOWS Andy (a Swedish node with age: 36 and hungry: true); Andy KNOWS Peter (age: 34); George KNOWS Peter.]

To recreate it, run the following query against an empty Neo4j database:

CREATE
(a:Swedish {name: 'Andy', age: 36, hungry: true}),
(b {name: 'Stefan'}),
(c {name: 'Peter', age: 34}),
(d {name: 'George'}),
(a)-[:KNOWS]->(c),
(b)-[:KNOWS]->(a),
(d)-[:KNOWS]->(c)

Set a property
Update a node property:

Query

MATCH (n {name: 'Andy'})


SET n.surname = 'Taylor'
RETURN n.name, n.surname

The newly-changed node is returned by the query.

Result

n.name n.surname

"Andy" "Taylor"

Rows: 1
Properties set: 1

Update a relationship property:

Query

MATCH (n:Swedish {name: 'Andy'})-[r:KNOWS]->(m)


SET r.since = 1999
RETURN r, m.name AS friend

Result

r friend

[:KNOWS {since: 1999}] "Peter"

Rows: 1
Properties set: 1

It is possible to set a property on a node or relationship using more complex expressions. For instance, in
contrast to specifying the node directly, the following query shows how to set a property for a node
selected by an expression:

Query

MATCH (n {name: 'Andy'})


SET (CASE WHEN n.age = 36 THEN n END).worksIn = 'Malmo'
RETURN n.name, n.worksIn

Result

n.name n.worksIn

"Andy" "Malmo"

Rows: 1
Properties set: 1

No action will be taken if the node expression evaluates to null, as shown in this example:

Query

MATCH (n {name: 'Andy'})


SET (CASE WHEN n.age = 55 THEN n END).worksIn = 'Malmo'
RETURN n.name, n.worksIn

As no node matches the CASE expression, the expression returns a null. As a consequence, no updates
occur, and therefore no worksIn property is set.

Result

n.name n.worksIn

"Andy" <null>

Rows: 1

Update a property
SET can be used to update a property on a node or relationship. This query forces a change of type in the
age property:

Query

MATCH (n {name: 'Andy'})


SET n.age = toString(n.age)
RETURN n.name, n.age

The age property has been converted to the STRING '36'.

Result

n.name n.age

"Andy" "36"

Rows: 1
Properties set: 1

Dynamically setting or updating a property (new in Neo4j 5.24)


SET can be used to set or update a property on a node or relationship even when the property key name is
not statically known.

SET n[key] = expression

The dynamically calculated key must evaluate to a STRING value. This query creates a copy of every
property on the nodes:

Query

MATCH (n)
FOREACH (k IN keys(n) | SET n[k + "Copy"] = n[k]) ①
RETURN n.name, keys(n);

① The FOREACH clause iterates over each property key k obtained from the keys() function. For each key, it
sets a new property on the nodes with a key name of k + "Copy" and copies the value from the original
property.

The nodes now have copies of all their properties.

Result

n.name keys(n)

"Andy" ["name", "nameCopy", "age", "ageCopy", "hungry",


"hungryCopy"]

"Stefan" ["name", "nameCopy"]

"Peter" ["name", "nameCopy", "age", "ageCopy"]

"George" ["name", "nameCopy"]

Rows: 4
Properties set: 6

Remove a property
Although REMOVE is normally used to remove a property, it is sometimes convenient to do it using the SET
command. A case in point is if the property is provided by a parameter.

Query

MATCH (n {name: 'Andy'})


SET n.name = null
RETURN n.name, n.age

The name property is now missing.

Result

n.name n.age

<null> "36"

Rows: 1
Properties set: 1
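If the value is supplied by a parameter, the same pattern removes the property whenever that parameter happens to be null. A sketch, assuming a parameter named surname:

Parameters

{
  "surname": null
}

Query

MATCH (n:Swedish)
SET n.surname = $surname
RETURN n.name, n.surname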

Copy properties between nodes and relationships


SET can be used to copy all properties from one node or relationship to another using the properties()
function. This will remove all other properties on the node or relationship being copied to.

Query

MATCH
(at {name: 'Andy'}),
(pn {name: 'Peter'})
SET at = properties(pn)
RETURN at.name, at.age, at.hungry, pn.name, pn.age

The 'Andy' node has had all its properties replaced by the properties of the 'Peter' node.

Result

at.name at.age at.hungry pn.name pn.age

"Peter" 34 <null> "Peter" 34


Rows: 1
Properties set: 3

Replace all properties using a map and =


The property replacement operator = can be used with SET to replace all existing properties on a node or
relationship with those provided by a map:

Query

MATCH (p {name: 'Peter'})


SET p = {name: 'Peter Smith', position: 'Entrepreneur'}
RETURN p.name, p.age, p.position

This query updated the name property from Peter to Peter Smith, deleted the age property, and added the
position property to the 'Peter' node.

Result

p.name p.age p.position

"Peter Smith" <null> "Entrepreneur"

Rows: 1
Properties set: 3

Remove all properties using an empty map and =


All existing properties can be removed from a node or relationship by using SET with = and an empty map
as the right operand:

Query

MATCH (p {name: 'Peter'})


SET p = {}
RETURN p.name, p.age

This query removed all the existing properties — namely, name and age — from the 'Peter' node.

Result

p.name p.age

<null> <null>

Rows: 1
Properties set: 2

Mutate specific properties using a map and +=


The property mutation operator += can be used with SET to mutate properties from a map in a fine-grained
fashion:

• Any properties in the map that are not on the node or relationship will be added.

• Any properties not in the map that are on the node or relationship will be left as is.

• Any properties that are in both the map and the node or relationship will be replaced in the node or
relationship. However, if any property in the map is null, it will be removed from the node or
relationship.

Query

MATCH (p {name: 'Peter'})


SET p += {age: 38, hungry: true, position: 'Entrepreneur'}
RETURN p.name, p.age, p.hungry, p.position

This query left the name property unchanged, updated the age property from 34 to 38, and added the hungry
and position properties to the 'Peter' node.

Result

p.name p.age p.hungry p.position

"Peter" 38 true "Entrepreneur"

Rows: 1
Properties set: 3
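Because a null value in the map removes the corresponding property, += can also be used to delete selected properties. A sketch, continuing from the state above:

MATCH (p {name: 'Peter'})
SET p += {hungry: null}
RETURN p.name, p.age, p.hungry, p.position

This would remove the hungry property and leave name, age, and position untouched.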

In contrast to the property replacement operator =, providing an empty map as the right operand to +=
will not remove any existing properties from a node or relationship. In line with the semantics
detailed above, passing in an empty map with += will have no effect:

Query

MATCH (p {name: 'Peter'})


SET p += {}
RETURN p.name, p.age

Result

p.name p.age

"Peter" 34

Rows: 1

Set multiple properties using one SET clause


Set multiple properties at once by separating them with a comma:

Query

MATCH (n {name: 'Andy'})


SET n.position = 'Developer', n.surname = 'Taylor'

Result
(empty result)

Rows: 0
Properties set: 2

Set a property using a parameter


Use a parameter to set the value of a property:

Parameters

{
"surname": "Taylor"
}

Query

MATCH (n {name: 'Andy'})


SET n.surname = $surname
RETURN n.name, n.surname

A surname property has been added to the 'Andy' node.

Result

n.name n.surname

"Andy" "Taylor"

Rows: 1
Properties set: 1

Set all properties using a parameter


This will replace all existing properties on the node with the new set provided by the parameter.

Parameters

{
"props" : {
"name": "Andy",
"position": "Developer"
}
}

Query

MATCH (n {name: 'Andy'})


SET n = $props
RETURN n.name, n.position, n.age, n.hungry

The 'Andy' node has had all its properties replaced by the properties in the props parameter.

Result

n.name n.position n.age n.hungry

"Andy" "Developer" <null> <null>


Rows: 1
Properties set: 4

Set a label on a node


Use SET to set a label on a node:

Query

MATCH (n {name: 'Stefan'})


SET n:German
RETURN n.name, labels(n) AS labels

The newly-labeled node is returned by the query.

Result

n.name labels

"Stefan" ["German"]

Rows: 1
Labels added: 1

Dynamically setting a label (new in Neo4j 5.24)


SET can be used to set a label on a node even when the label is not statically known.

MATCH (n)
SET n:$(expr)

Query

MATCH (n:Swedish)
SET n:$(n.name)
RETURN n.name, labels(n) AS labels

The newly-labeled node is returned by the query.

Result

n.name labels

"Andy" ["Swedish", "Andy"]

Rows: 1
Labels added: 1

Set a label using a parameter (new in Neo4j 5.24)
Use a parameter to set the value of a label:

Parameters

{
"label": "Danish"
}

Query

MATCH (n {name: 'Stefan'})


SET n:$($label)
RETURN labels(n) AS labels

A Danish label has been added to the 'Stefan' node.

Result

labels

['German', 'Danish']

Rows: 1
Labels added: 1

Set multiple labels on a node


Set multiple labels on a node with SET and use : to separate the different labels:

Query

MATCH (n {name: 'George'})


SET n:Swedish:Bossman
RETURN n.name, labels(n) AS labels

The newly-labeled node is returned by the query.

Result

n.name labels

"George" ["Swedish","Bossman"]

Rows: 1
Labels added: 2

Set multiple labels dynamically on a node (new in Neo4j 5.24)


It is possible to set multiple labels dynamically using a LIST<STRING> and/or by chaining them separately
with a colon (:):

Query

WITH COLLECT { UNWIND range(0,3) AS id RETURN "Label" + id } as labels ①


MATCH (n {name: 'George'})
SET n:$(labels)
RETURN n.name, labels(n) AS labels

① A COLLECT subquery aggregates the results of UNWIND range(0,3) AS id RETURN "Label" + id, which
generates a LIST<STRING> ("Label0", "Label1", "Label2", "Label3"), and assigns it to the variable
labels.

The newly-labeled node is returned by the query.

Result

n.name labels

"George" ["Swedish","Bossman", "Label0", "Label1", "Label2",


"Label3"]

Rows: 1
Labels added: 4

Set multiple labels using parameters (new in Neo4j 5.24)


Use a parameter to set multiple labels:

Parameters

{
"labels": ["Swedish", "German"]
}

Query

MATCH (n {name: 'Peter'})


SET n:$($labels)
RETURN labels(n) AS labels

A Swedish and a German label have been added to the 'Peter' node.

Result

labels

['Swedish', 'German']

Rows: 1
Labels added: 2

SHOW FUNCTIONS
Listing the available functions can be done with SHOW FUNCTIONS.

 The command SHOW FUNCTIONS returns only the default output. For a full output use the optional YIELD command. Full output: SHOW FUNCTIONS YIELD *.

The SHOW FUNCTIONS command will produce a table with the following columns:

List functions output

name (STRING): The name of the function. Default output.
category (STRING): The function category, for example scalar or string. Default output.
description (STRING): The function description. Default output.
signature (STRING): The signature of the function.
isBuiltIn (BOOLEAN): Whether the function is built-in or user-defined.
argumentDescription (LIST<MAP>): List of the arguments for the function, as map of strings and booleans with name, type, default, isDeprecated, and description.
returnDescription (STRING): The return value type.
aggregating (BOOLEAN): Whether the function is aggregating or not.
rolesExecution (LIST<STRING>): List of roles permitted to execute this function. Is null without the SHOW ROLE privilege.
rolesBoostedExecution (LIST<STRING>): List of roles permitted to use boosted mode when executing this function. Is null without the SHOW ROLE privilege.
isDeprecated (BOOLEAN): Whether the function is deprecated. New
deprecatedBy (STRING): The replacement function to use in case of deprecation; otherwise null. New

Syntax

 More details about the syntax descriptions can be found here.

List functions, either all or only built-in or user-defined

SHOW [ALL|BUILT IN|USER DEFINED] FUNCTION[S]
[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

 When using the RETURN clause, the YIELD clause is mandatory and must not be omitted.

List functions that the current user can execute

SHOW [ALL|BUILT IN|USER DEFINED] FUNCTION[S] EXECUTABLE [BY CURRENT USER]


[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

 When using the RETURN clause, the YIELD clause is mandatory and must not be omitted.

List functions that the specified user can execute

SHOW [ALL|BUILT IN|USER DEFINED] FUNCTION[S] EXECUTABLE BY username


[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

Required privilege SHOW USER. This command cannot be used for LDAP users.

 When using the RETURN clause, the YIELD clause is mandatory and must not be omitted.

Listing all functions


To list all available functions with the default output columns, the SHOW FUNCTIONS command can be used. If
all columns are required, use SHOW FUNCTIONS YIELD *.

Query

SHOW FUNCTIONS

Result

name category description

"abs" "Numeric" "Returns the absolute value of an


INTEGER."

"abs" "Numeric" "Returns the absolute value of a


FLOAT."

"acos" "Trigonometric" "Returns the arccosine of a FLOAT in


radians."

"all" "Predicate" "Returns true if the predicate holds


for all elements in the given
LIST<ANY>."

"any" "Predicate" "Returns true if the predicate holds


for at least one element in the
given LIST<ANY>."


"asin" "Trigonometric" "Returns the arcsine of a FLOAT in


radians."

"atan" "Trigonometric" "Returns the arctangent of a FLOAT


in radians."

"atan2" "Trigonometric" "Returns the arctangent2 of a set of


coordinates in radians."

"avg" "Aggregating" "Returns the average of a set of


INTEGER values."

"avg" "Aggregating" "Returns the average of a set of


FLOAT values."

"avg" "Aggregating" "Returns the average of a set of


DURATION values."

"ceil" "Numeric" "Returns the smallest FLOAT that is


greater than or equal to a number
and equal to an INTEGER."

"coalesce" "Scalar" "Returns the first non-null value in


a list of expressions."

"collect" "Aggregating" "Returns a list containing the


values returned by an expression."

"cos" "Trigonometric" "Returns the cosine of a FLOAT."

"cot" "Trigonometric" "Returns the cotangent of a FLOAT."

"count" "Aggregating" "Returns the number of values or


rows."

"date" "Temporal" "Creates a DATE instant."

"date.realtime" "Temporal" "Returns the current DATE instant


using the realtime clock."

"date.statement" "Temporal" "Returns the current DATE instant


using the statement clock."

Rows: 20

The above table only displays the first 20 results of the query. For a full list of all available functions in
Cypher, see the chapter on Functions.

Listing functions with filtering on output columns


The listed functions can be filtered in multiple ways. One way is through the type keywords, BUILT IN and
USER DEFINED. A more flexible way is to use the WHERE clause. For example, getting the name of all built-in
functions starting with the letter 'a':

Query

SHOW BUILT IN FUNCTIONS YIELD name, isBuiltIn


WHERE name STARTS WITH 'a'

Result

name isBuiltIn

"abs" true

"abs" true

"acos" true

"all" true

"any" true

"asin" true

"atan" true

"atan2" true

"avg" true

"avg" true

"avg" true

Rows: 11

Listing functions with other filtering


The listed functions can also be filtered on whether a user can execute them. This filtering is only available
through the EXECUTABLE clause and not through the WHERE clause. This is due to using the user’s privileges
instead of filtering on the available output columns.

There are two options for how to use the EXECUTABLE clause. The first option is to filter for the current user:

Query

SHOW FUNCTIONS EXECUTABLE BY CURRENT USER YIELD *

Result

name category description rolesExecution rolesBoostedExecu …


tion

"abs" "Numeric" "Returns the <null> <null>


absolute value of
an INTEGER."

"abs" "Numeric" "Returns the <null> <null>


absolute value of
a FLOAT."

"acos" "Trigonometric" "Returns the <null> <null>


arccosine of a
FLOAT in
radians."

"all" "Predicate" "Returns true if <null> <null>


the predicate
holds for all
elements in the
given LIST<ANY>."


"any" "Predicate" "Returns true if <null> <null>


the predicate
holds for at
least one element
in the given
LIST<ANY>."

"asin" "Trigonometric" "Returns the <null> <null>


arcsine of a
FLOAT in
radians."

"atan" "Trigonometric" "Returns the <null> <null>


arctangent of a
FLOAT in
radians."

"atan2" "Trigonometric" "Returns the <null> <null>


arctangent2 of a
set of
coordinates in
radians."

"avg" "Aggregating" "Returns the <null> <null>


average of a set
of INTEGER
values."

"avg" "Aggregating" "Returns the <null> <null>


average of a set
of FLOAT values."

Rows: 10

Notice that the two roles columns are empty due to missing the SHOW ROLE privilege. Also note that the
following columns are not present in the table:

• signature

• isBuiltIn

• argumentDescription

• returnDescription

• aggregating

• isDeprecated

• deprecatedBy

The second option is to filter for a specific user:

Query

SHOW FUNCTIONS EXECUTABLE BY jake

Result

name category description

"abs" "Numeric" "Returns the absolute value of an


INTEGER."

"abs" "Numeric" "Returns the absolute value of a


FLOAT."

"acos" "Trigonometric" "Returns the arccosine of a FLOAT in


radians."

"all" "Predicate" "Returns true if the predicate holds


for all elements in the given
LIST<ANY>."

"any" "Predicate" "Returns true if the predicate holds


for at least one element in the
given LIST<ANY>."

"asin" "Trigonometric" "Returns the arcsine of a FLOAT in


radians."

"atan" "Trigonometric" "Returns the arctangent of a FLOAT


in radians."

"atan2" "Trigonometric" "Returns the arctangent2 of a set of


coordinates in radians."

"avg" "Aggregating" "Returns the average of a set of


INTEGER values."

"avg" "Aggregating" "Returns the average of a set of


FLOAT values."

Rows: 10

SHOW PROCEDURES
Listing the available procedures can be done with SHOW PROCEDURES.

 The command SHOW PROCEDURES returns only the default output. For a full output use the optional YIELD command. Full output: SHOW PROCEDURES YIELD *.

This command will produce a table with the following columns:

List procedures output

name (STRING): The name of the procedure. Default output.
description (STRING): The procedure description. Default output.
mode (STRING): The procedure mode, for example READ or WRITE. Default output.
worksOnSystem (BOOLEAN): Whether the procedure can be run on the system database or not. Default output.
signature (STRING): The signature of the procedure.
argumentDescription (LIST<MAP>): List of the arguments for the procedure, as map of strings and booleans with name, type, default, isDeprecated, and description.
returnDescription (LIST<MAP>): List of the returned values for the procedure, as map of strings and booleans with name, type, isDeprecated, and description.
admin (BOOLEAN): true if this procedure is an admin procedure.
rolesExecution (LIST<STRING>): List of roles permitted to execute this procedure. Is null without the SHOW ROLE privilege.
rolesBoostedExecution (LIST<STRING>): List of roles permitted to use boosted mode when executing this procedure. Is null without the SHOW ROLE privilege.
isDeprecated (BOOLEAN): Whether the procedure is deprecated. New
deprecatedBy (STRING): The replacement procedure to use in case of deprecation; otherwise null. New
option (MAP): Map of extra output, e.g. if the procedure is deprecated.

The deprecation information for procedures is returned both in the isDeprecated and option columns.

Syntax

 More details about the syntax descriptions can be found here.

List all procedures

SHOW PROCEDURE[S]
[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

 When using the RETURN clause, the YIELD clause is mandatory and must not be omitted.

List procedures that the current user can execute

SHOW PROCEDURE[S] EXECUTABLE [BY CURRENT USER]


[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

 When using the RETURN clause, the YIELD clause is mandatory and must not be omitted.

List procedures that the specified user can execute

SHOW PROCEDURE[S] EXECUTABLE BY username


[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

Requires the privilege SHOW USER. This command cannot be used for LDAP users.

 When using the RETURN clause, the YIELD clause is mandatory and must not be omitted.

Listing all procedures


To list all available procedures with the default output columns, the SHOW PROCEDURES command can be
used. If all columns are required, use SHOW PROCEDURES YIELD *.

Query

SHOW PROCEDURES

Result

name description mode worksOnSystem

"cdc.current" "Returns the current change "READ" false


identifier that can be used to
stream changes from."

"cdc.earliest" "Returns the earliest change "READ" false


identifier that can be used to
stream changes from."

"cdc.query" "Query changes observed by the "READ" false


provided change identifier."

"db.awaitIndex" "Wait for an index to come online "READ" true


(for example: CALL
db.awaitIndex("MyIndex", 300))."

"db.awaitIndexes" "Wait for all indexes to come online "READ" true


(for example: CALL
db.awaitIndexes(300))."

"db.checkpoint" "Initiate and wait for a new check "DBMS" true


point, or wait any already on-going
check point to complete. Note that
this temporarily disables the
dbms.checkpoint.iops.limit setting
in order to make the check point
complete faster. This might cause
transaction throughput to degrade
slightly, due to increased IO load."


"db.clearQueryCaches" "Clears all query caches." "DBMS" true

"db.createLabel" "Create a label" "WRITE" false

"db.createProperty" "Create a Property" "WRITE" false

"db.createRelationshipType" "Create a RelationshipType" "WRITE" false

"db.index.fulltext.awaitEventuallyCo "Wait for the updates from recently "READ" true


nsistentIndexRefresh" committed transactions to be applied
to any eventually-consistent full-
text indexes."

"db.index.fulltext.listAvailableAnal "List the available analyzers that "READ" true


yzers" the full-text indexes can be
configured with."

"db.index.fulltext.queryNodes" "Query the given full-text index. "READ" true


Returns the matching nodes, and
their Lucene query score, ordered by
score. Valid keys for the options
map are: 'skip' to skip the top N
results; 'limit' to limit the number
of results returned; 'analyzer' to
use the specified analyzer as search
analyzer for this query."

"db.index.fulltext.queryRelationship "Query the given full-text index. "READ" true


s" Returns the matching relationships,
and their Lucene query score,
ordered by score. Valid keys for the
options map are: 'skip' to skip the
top N results; 'limit' to limit the
number of results returned;
'analyzer' to use the specified
analyzer as search analyzer for this
query."

"db.info" "Provides information regarding the "READ" true


database."

"db.labels" "List all available labels in the "READ" true


database."

"db.listLocks" "List all locks in the database." "DBMS" true

"db.ping" "This procedure can be used by "READ" true


client side tooling to test whether
they are correctly connected to a
database. The procedure is available
in all databases and always returns
true. A faulty connection can be
detected by not being able to call
this procedure."

Rows: 15

The above table only displays the first 15 results of the query. For a full list of all built-in procedures in
Neo4j, visit the Operations Manual → List of procedures.

Listing procedures with filtering on output columns


The listed procedures can be filtered in multiple ways. One way is to use the WHERE clause. For example,
returning the names of all admin procedures:

Query

SHOW PROCEDURES YIELD name, admin


WHERE admin

Result

name admin

"db.clearQueryCaches" true

"db.listLocks" true

"db.prepareForReplanning" true

"db.stats.clear" true

"db.stats.collect" true

"db.stats.retrieve" true

"db.stats.retrieveAllAnonymized" true

"db.stats.status" true

"db.stats.stop" true

"dbms.checkConfigValue" true

"dbms.cluster.checkConnectivity" true

"dbms.cluster.cordonServer" true

"dbms.cluster.readReplicaToggle" true

"dbms.cluster.uncordonServer" true

"dbms.listConfig" true

Rows: 15

The above table only displays the first 15 results of the query. For a full list of all procedures which require
admin privileges in Neo4j, visit the Operations Manual → List of procedures.

Listing procedures with other filtering


The listed procedures can also be filtered by whether a user can execute them. This filtering is only
available through the EXECUTABLE clause and not through the WHERE clause. This is due to using the user’s
privileges instead of filtering on the available output columns.

There are two options for using the EXECUTABLE clause. The first option is to filter for the current user:

Query

SHOW PROCEDURES EXECUTABLE BY CURRENT USER YIELD *

Result

name description rolesExecution rolesBoostedEx …
ecution

"db.awaitIndex" "Wait for an index to come <null> <null>


online (for example: CALL
db.awaitIndex("MyIndex",
300))."

"db.awaitIndexes" "Wait for all indexes to come <null> <null>


online (for example: CALL
db.awaitIndexes(300))."

"db.checkpoint" "Initiate and wait for a new <null> <null>


check point, or wait any
already on-going check point
to complete. Note that this
temporarily disables the
dbms.checkpoint.iops.limit
setting in order to make the
check point complete faster.
This might cause transaction
throughput to degrade
slightly, due to increased IO
load."

"db.clearQueryCaches" "Clears all query caches." <null> <null>

"db.createLabel" "Create a label" <null> <null>

"db.createProperty" "Create a Property" <null> <null>

"db.createRelationshipType" "Create a RelationshipType" <null> <null>

"db.index.fulltext.awaitEventu "Wait for the updates from <null> <null>


allyConsistentIndexRefresh" recently committed
transactions to be applied to
any eventually-consistent
full-text indexes."

"db.index.fulltext.listAvailab "List the available analyzers <null> <null>


leAnalyzers" that the full-text indexes can
be configured with."

"db.index.fulltext.queryNodes" "Query the given full-text <null> <null>


index. Returns the matching
nodes, and their Lucene query
score, ordered by score. Valid
keys for the options map are:
'skip' to skip the top N
results; 'limit' to limit the
number of results returned;
'analyzer' to use the
specified analyzer as search
analyzer for this query."

"db.index.fulltext.queryRelati "Query the given full-text <null> <null>


onships" index. Returns the matching
relationships, and their
Lucene query score, ordered by
score. Valid keys for the
options map are: 'skip' to
skip the top N results;
'limit' to limit the number of
results returned; 'analyzer'
to use the specified analyzer
as search analyzer for this
query."

"db.info" "Provides information <null> <null>


regarding the database."


"db.labels" "List all available labels in <null> <null>


the database."

"db.listLocks" "List all locks in the <null> <null>


database."

"db.ping" "This procedure can be used by <null> <null>


client side tooling to test
whether they are correctly
connected to a database. The
procedure is available in all
databases and always returns
true. A faulty connection can
be detected by not being able
to call this procedure."

Rows: 15

The above table only displays the first 15 results of the query. Note that the two roles columns are empty
due to missing the SHOW ROLE privilege. Also note that the following columns are not present in the table:

• mode

• worksOnSystem

• signature

• argumentDescription

• returnDescription

• admin

• isDeprecated

• deprecatedBy

• options

The second option for using the EXECUTABLE clause is to filter the list to only contain procedures executable
by a specific user. The below example shows the procedures available to the user jake, who has been
granted the EXECUTE PROCEDURE dbms.* privilege by the admin of the database. (More information about
DBMS EXECUTE privilege administration can be found in the Operations Manual → The DBMS EXECUTE
privileges).

Query

SHOW PROCEDURES EXECUTABLE BY jake

Result

name description mode worksOnSystem

"dbms.cluster.protocols" "Overview of installed protocols." "DBMS" true


"dbms.cluster.routing.getRoutingTabl "Returns the advertised bolt capable "DBMS" true


e" endpoints for a given database,
divided by each endpoint’s
capabilities. For example an
endpoint may serve read queries,
write queries and/or future
getRoutingTable requests."

"dbms.components" "List DBMS components and their "DBMS" true


versions."

"dbms.info" "Provides information regarding the "DBMS" true


DBMS."

"dbms.killConnection "Kill network connection with the "DBMS" false


given connection id."

"dbms.killConnections" "Kill all network connections with "DBMS" true


the given connection ids."

"dbms.listActiveLocks" "List the active lock requests "DBMS" true


granted for the transaction
executing the query with the given
query id."

"dbms.listCapabilities" "List capabilities" "DBMS" true

"dbms.listConnections" "List all accepted network "DBMS" true


connections at this instance that
are visible to the user."

"dbms.listPools" "List all memory pools, including "DBMS" true


sub pools, currently registered at
this instance that are visible to
the user."

"dbms.queryJmx" "Query JMX management data by domain "DBMS" true


and name. For instance, ":""

"dbms.routing.getRoutingTable" "Returns the advertised bolt capable "DBMS" true


endpoints for a given database,
divided by each endpoint’s
capabilities. For example an
endpoint may serve read queries,
write queries and/or future
getRoutingTable requests."

"dbms.showCurrentUser" "Shows the current user." "DBMS" true

Rows: 13

SHOW SETTINGS
Listing the configuration settings on a server can be done with SHOW SETTINGS.

 The command SHOW SETTINGS returns settings on the executing server only. To retrieve settings on a specific server, you need to connect to it directly using the bolt scheme.

 The command SHOW SETTINGS returns only the default output. For a full output use the optional YIELD command. Full output: SHOW SETTINGS YIELD *.

The SHOW SETTINGS command will produce a table with the following columns:

Show settings output

name (STRING): The name of the setting. Default output.
value (STRING): The current value of the setting. Default output.
isDynamic (BOOLEAN): Whether the value of the setting can be updated dynamically, without restarting the server. For dynamically updating a setting value, see Update dynamic settings. Default output.
defaultValue (STRING): The default value of the setting. Default output.
description (STRING): The setting description. Default output.
startupValue (STRING): The value of the setting at last startup.
isExplicitlySet (BOOLEAN): Whether the value of the setting is explicitly set by the user, either through configuration or dynamically.
validValues (STRING): A description of valid values for the setting.
isDeprecated (BOOLEAN): Whether the setting is deprecated. New

Syntax

 More details about the syntax descriptions can be found here.

List settings

SHOW SETTING[S] [setting-name[,...]]


[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

Setting names must be supplied as one or more comma-separated quoted STRING values or as an
expression resolving to a STRING or a LIST<STRING>.

 When using the RETURN clause, the YIELD clause is mandatory and must not be omitted.

Listing all settings
To list all settings with the default output columns, the SHOW SETTINGS command can be used. If all
columns are required, use SHOW SETTINGS YIELD *.

Query

SHOW SETTINGS

Result

name value isDynamic defaultValue description

"browser.allow_outgoing_co "true" false "true" "Configure the policy for outgoing Neo4j
nnections" Browser connections."

"browser.credential_timeou "0s" false "0s" "Configure the Neo4j Browser to time out
t" logged in users after this idle period.
Setting this to 0 indicates no limit."

"browser.post_connect_cmd" "" false "" "Commands to be run when Neo4j Browser


successfully connects to this server.
Separate multiple commands with semi-
colon."

"browser.remote_content_ho "guides.neo4 false "guides.neo4 "Whitelist of hosts for the Neo4j


stname_whitelist" j.com,localh j.com,localh Browser to be allowed to fetch content
ost" ost" from."

"browser.retain_connection "true" false "true" "Configure the Neo4j Browser to store or


_credentials" not store user credentials."

"browser.retain_editor_his "true" false "true" "Configure the Neo4j Browser to store or


tory" not store user editor history."

"client.allow_telemetry" "true" false "true" "Configure client applications such as


Browser and Bloom to send Product
Analytics data."

"db.checkpoint" "PERIODIC" false "PERIODIC" "Configures the general policy for when
check-points should occur. The default
policy is the 'periodic' check-point
policy, as specified by the
'db.checkpoint.interval.tx' and
'db.checkpoint.interval.time' settings.
The Neo4j Enterprise Edition provides
two alternative policies: The first is
the 'continuous' check-point policy,
which will ignore those settings and run
the check-point process all the time.
The second is the 'volumetric' check-
point policy, which makes a best-effort
at check-pointing often enough so that
the database doesn’t get too far behind
on deleting old transaction logs in
accordance with the
'db.tx_log.rotation.retention_policy'
setting."


"db.checkpoint.interval.ti "15m" false "15m" "Configures the time interval between


me" check-points. The database will not
check-point more often than this (unless
check pointing is triggered by a
different event), but might check-point
less often than this interval, if
performing a check-point takes longer
time than the configured interval. A
check-point is a point in the
transaction logs, which recovery would
start from. Longer check-point intervals
typically mean that recovery will take
longer to complete in case of a crash.
On the other hand, a longer check-point
interval can also reduce the I/O load
that the database places on the system,
as each check-point implies a flushing
and forcing of all the store files."

"db.checkpoint.interval.tx "100000" false "100000" "Configures the transaction interval


" between check-points. The database will
not check-point more often than this
(unless check pointing is triggered by a
different event), but might check-point
less often than this interval, if
performing a check-point takes longer
time than the configured interval. A
check-point is a point in the
transaction logs, which recovery would
start from. Longer check-point intervals
typically mean that recovery will take
longer to complete in case of a crash.
On the other hand, a longer check-point
interval can also reduce the I/O load
that the database places on the system,
as each check-point implies a flushing
and forcing of all the store files. The
default is '100000' for a check-point
every 100000 transactions."

Rows: 10

The above table only displays the first 10 results of the query. For a full list of all available settings in
Neo4j, refer to Configuration settings.

Listing settings with filtering on output columns


The listed settings can be filtered by using the WHERE clause. For example, the following query returns the
name, value, and description of the first three settings starting with 'dbms':

Query

SHOW SETTINGS YIELD name, value, description


WHERE name STARTS WITH 'dbms'
RETURN name, value, description
LIMIT 3

Result

name | value | description

"dbms.cluster.catchup.client_inactivity_timeout" | "10m" | "The catchup protocol times out if the given duration elapses with no network activity. Every message received by the client from the server extends the timeout duration."

"dbms.cluster.discovery.endpoints" | null | "A comma-separated list of endpoints that a server should contact in order to discover other cluster members. Typically, all cluster members, including the current server, must be specified in this list. The setting configures the endpoints for Discovery service V1."

"dbms.cluster.discovery.log_level" | "WARN" | "The level of middleware logging."

Rows: 3

Listing specific settings


It is possible to specify which settings to return by listing their names in the command.

Query

SHOW SETTINGS "server.bolt.enabled", "server.bolt.advertised_address", "server.bolt.listen_address"

Result

name | value | isDynamic | defaultValue | description

"server.bolt.advertised_address" | "localhost:7687" | false | ":7687" | "Advertised address for this connector."

"server.bolt.enabled" | "true" | false | "true" | "Enable the bolt connector."

"server.bolt.listen_address" | "localhost:7687" | false | ":7687" | "Address the connector should bind to."

Rows: 3

SKIP
SKIP (and its synonym OFFSET) defines from which row to start including the rows in the output.

By using SKIP, the result set will get trimmed from the top.

 Unless the row order is specified with ORDER BY, Neo4j does not guarantee which rows are skipped by SKIP/OFFSET. The only clause that guarantees a specific row order is ORDER BY.

SKIP accepts any expression that evaluates to a positive INTEGER and does not refer to nodes or
relationships.

Example graph
The following graph is used for the examples below:

The graph consists of five Person nodes: 'Andy', who has outgoing KNOWS relationships to 'Bernard', 'Charlotte', 'David', and 'Erika'.

To recreate it, run the following query against an empty Neo4j database:

CREATE
(andy: Person {name: 'Andy'}),
(bernard: Person {name: 'Bernard'}),
(charlotte: Person {name: 'Charlotte'}),
(david: Person {name: 'David'}),
(erika: Person {name: 'Erika'}),
(andy)-[:KNOWS]->(bernard),
(andy)-[:KNOWS]->(charlotte),
(andy)-[:KNOWS]->(david),
(andy)-[:KNOWS]->(erika)

Examples
Example 35. Skip the first three rows

The following query returns a subset of the result, starting from the fourth result.

Query

MATCH (n)
RETURN n.name
ORDER BY n.name
SKIP 3

Result

n.name

"David"

"Erika"

Rows: 2

150
Example 36. Return the middle two rows

The following query returns the middle two rows, with SKIP skipping the first and LIMIT removing the
final two.

Query

MATCH (n)
RETURN n.name
ORDER BY n.name
SKIP 1
LIMIT 2

Result

n.name

"Bernard"

"Charlotte"

Rows: 2

Example 37. Using an expression with SKIP to return a subset of the rows

SKIP accepts any expression that evaluates to a positive INTEGER, as long as it can be statically
calculated (i.e. calculated before the query is run).

This query skips the first row and then randomly skips an additional 0, 1, or 2 rows, resulting in
skipping a total of 1, 2, or 3 rows before returning the remaining names.

Query

MATCH (n)
RETURN n.name
ORDER BY n.name
SKIP 1 + toInteger(3 * rand())

Result

n.name

"Bernard"

"Charlotte"

"David"

"Erika"

Rows: 4

Using SKIP as a standalone clause (introduced in Neo4j 5.24)


SKIP can be used as a standalone clause, or in conjunction with ORDER BY or LIMIT.

151
Standalone use of SKIP

MATCH (n)
SKIP 2
RETURN collect(n.name) AS names

Result

names

["Charlotte", "David", "Erika"]

Rows: 1

The following query orders all nodes by name, skips the first two rows, and limits the results to two rows. It then collects the results in a list.

SKIP used in conjunction with ORDER BY and LIMIT

MATCH (n)
ORDER BY n.name
SKIP 2
LIMIT 2
RETURN collect(n.name) AS names

Result

names

["Charlotte", "David"]

Rows: 1

OFFSET as a synonym to SKIP (introduced in Neo4j 5.24)


OFFSET was introduced as part of Cypher’s GQL conformance and can be used as a synonym to SKIP.

Query

MATCH (n)
ORDER BY n.name
OFFSET 2
LIMIT 2
RETURN collect(n.name) AS names

Result

names

["Charlotte", "David"]

Rows: 1

UNION
UNION combines the results of two or more queries into a single result set that includes all the rows that
belong to any queries in the union.

152
The number and the names of the columns must be identical in all queries combined by using UNION.

To keep all the result rows, use UNION ALL. Using just UNION (or UNION DISTINCT) will combine and remove
duplicates from the result set.

If any of the queries in a UNION contain updates, the order of queries in the UNION is
relevant.

 Any clause before the UNION cannot observe writes made by a clause after the UNION.
Any clause after UNION can observe all writes made by a clause before the UNION.

See clause composition in queries with UNION for details.
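As a brief sketch of this rule (standalone, not part of the example graph below; the Counter label is illustrative), a MATCH placed after the UNION can observe a node created before the UNION, so each branch of the following union returns one row:

CREATE (:Counter {value: 1})
RETURN 'write part' AS source
UNION ALL
MATCH (c:Counter)
RETURN 'read part' AS source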

Example graph
The following graph is used for the examples below:

The graph contains two Actor nodes ('Johnny Depp' and 'Sarah Jessica Parker'), one node labeled both Actor and Director ('Ed Wood'), and a Movie node with the title 'Ed Wood'. Both 'Johnny Depp' and 'Sarah Jessica Parker' have ACTED_IN relationships to the movie.

To recreate the graph, run the following query against an empty Neo4j database:

CREATE (johnny:Actor {name: 'Johnny Depp'}),


(sarah:Actor {name: 'Sarah Jessica Parker'}),
(ed:Actor&Director {name: 'Ed Wood'}),
(edWoodMovie:Movie {title: 'Ed Wood'}),
(johnny)-[:ACTED_IN]->(edWoodMovie),
(sarah)-[:ACTED_IN]->(edWoodMovie)

Combine two queries and retain duplicates


Combining the results from two queries is done using UNION ALL.

Query

MATCH (n:Actor)
RETURN n.name AS name
UNION ALL
MATCH (n:Movie)
RETURN n.title AS name

The combined result is returned, including duplicates.

Result

name

"Johnny Depp"

"Sarah Jessica Parker"

"Ed Wood"

"Ed Wood"

Rows: 4

Combine two queries and remove duplicates


By not including ALL in the UNION, duplicates are removed from the combined result set.

Query

MATCH (n:Actor)
RETURN n.name AS name
UNION
MATCH (n:Movie)
RETURN n.title AS name

The combined result is returned, without duplicates.

Result

name

"Johnny Depp"

"Sarah Jessica Parker"

"Ed Wood"

Rows: 3

UNION DISTINCT (introduced in Neo4j 5.19)


Removal of duplicates can also be accomplished by explicitly including DISTINCT in the UNION. The UNION
DISTINCT keyword was introduced as part of Cypher’s GQL conformance, and using it is functionally the
same as using simple UNION.

Query

MATCH (n:Actor)
RETURN n.name AS name
UNION DISTINCT
MATCH (n:Movie)
RETURN n.title AS name

The combined result is returned, without duplicates.

Result

name

"Johnny Depp"

"Sarah Jessica Parker"

"Ed Wood"

Rows: 3

Post-union processing
The UNION clause can be used within a CALL subquery to further process the combined results before a final
output is returned. For example, the below query counts the occurrences of each name property returned
after the UNION ALL within the CALL subquery.

The below query uses an empty variable scope clause: CALL () { … } (introduced in

 Neo4j 5.23). If you are using an older version of Neo4j, use CALL { … } instead. For
more information, see CALL subqueries → Importing variables.

Query

CALL () {
MATCH (a:Actor)
RETURN a.name AS name
UNION ALL
MATCH (m:Movie)
RETURN m.title AS name
}
RETURN name, count(*) AS count
ORDER BY count

Result

name count

"Ed Wood" 2

"Johnny Depp" 1

"Sarah Jessica Parker" 1

Rows: 3

For more information, see CALL subqueries → Post-union processing.

UNWIND
The UNWIND clause makes it possible to transform any list back into individual rows. These lists can be
parameters that were passed in, previously collected results, or other list expressions.

Neo4j does not guarantee the row order produced by UNWIND. The only clause that
 guarantees a specific row order is ORDER BY.

Common usage of the UNWIND clause:

• Create distinct lists.

• Create data from parameter lists that are provided to the query.

 The UNWIND clause requires you to specify a new name for the inner values.

Unwinding a list
We want to transform the literal list into rows named x and return them.

Query

UNWIND [1, 2, 3, null] AS x


RETURN x, 'val' AS y

Each value of the original list — including null — is returned as an individual row.

Result

x y

1 "val"

2 "val"

3 "val"

<null> "val"

Rows: 4

Creating a distinct list


We want to transform a list of duplicates into a set using DISTINCT.

Query

WITH [1, 1, 2, 2] AS coll


UNWIND coll AS x
WITH DISTINCT x
RETURN collect(x) AS setOfVals

Each value of the original list is unwound and passed through DISTINCT to create a unique set.

Result

setOfVals

[1,2]

Rows: 1

Using UNWIND with any expression returning a list


Any expression that returns a list may be used with UNWIND.

Query

WITH
[1, 2] AS a,
[3, 4] AS b
UNWIND (a + b) AS x
RETURN x

The two lists — a and b — are concatenated to form a new list, which is then operated upon by UNWIND.

Result

x

1

2

3

4

Rows: 4

Using UNWIND with a list of lists


Multiple UNWIND clauses can be chained to unwind nested list elements.

Query

WITH [[1, 2], [3, 4], 5] AS nested


UNWIND nested AS x
UNWIND x AS y
RETURN y

The first UNWIND results in three rows for x, each of which contains an element of the original list (two of
which are also lists); namely, [1, 2], [3, 4], and 5. The second UNWIND then operates on each of these
rows in turn, resulting in five rows for y.

Result

y

1

2

3

4

5

Rows: 5

Using UNWIND with an empty list


Using an empty list with UNWIND will produce no rows, irrespective of whether or not any rows existed
beforehand, or whether or not other values are being projected.

Essentially, UNWIND [] reduces the number of rows to zero, and thus causes the query to cease its
execution, returning no results. This has value in cases such as UNWIND v, where v is a variable from an
earlier clause that may or may not be an empty list — when it is an empty list, this will behave just as a
MATCH that has no results.

Query

UNWIND [] AS empty
RETURN 'literal_that_is_not_returned'

Result
(empty result)

Rows: 0

To avoid inadvertently using UNWIND on an empty list, CASE may be used to replace an empty list with a
null:

WITH [] AS list
UNWIND
CASE
WHEN list = [] THEN [null]
ELSE list
END AS emptylist
RETURN emptylist

Using UNWIND with an expression that is not a list


Using UNWIND on an expression that does not return a list, will return the same result as using UNWIND on a
list that just contains that expression. As an example, UNWIND 5 is effectively equivalent to UNWIND [5]. The
exception to this is when the expression returns null — this will reduce the number of rows to zero,
causing it to cease its execution and return no results.

Query

UNWIND null AS x
RETURN x, 'some_literal'

Result
(empty result)

Rows: 0
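For comparison, unwinding a non-list, non-null value behaves as if the value were wrapped in a single-element list (a minimal illustration, not part of the original example set):

UNWIND 5 AS x
RETURN x

This returns a single row with x equal to 5.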

Creating nodes from a list parameter


Create a number of nodes and relationships from a parameter-list without using FOREACH.

Parameters

{
"events" : [ {
"year" : 2014,
"id" : 1
}, {
"year" : 2014,
"id" : 2
} ]
}

Query

UNWIND $events AS event


MERGE (y:Year {year: event.year})
MERGE (y)<-[:IN]-(e:Event {id: event.id})
RETURN e.id AS x ORDER BY x

Each value of the original list is unwound and passed through MERGE to find or create the nodes and
relationships.

Result

x

1

2

Rows: 2
Nodes created: 3
Relationships created: 2
Properties set: 3
Labels added: 3

USE
The USE clause determines which graph a query, or query part, is executed against. It is supported for
queries and schema commands.

Syntax
The USE clause can only appear as the prefix of schema commands, or as the first clause of queries:

USE <graph>
<other clauses>

Where <graph> refers to the name or alias of a database in the DBMS.

Composite database syntax


When running queries against a composite database, the USE clause can also appear as the first clause of:

• Union parts:

USE <graph>
<other clauses>
UNION
USE <graph>
<other clauses>

• Subqueries:

CALL () {
USE <graph>
<other clauses>
}

In subqueries, a USE clause may appear directly following the variable scope clause: CALL () { … }
(introduced in Neo4j 5.23). Or, if you are using an older version of Neo4j, directly following an
importing WITH clause.

When executing queries against a composite database, the USE clause must only refer to graphs that are
part of the current composite database.

Examples

Query a graph
In this example it is assumed that the DBMS contains a database named myDatabase:

Query

USE myDatabase
MATCH (n) RETURN n

Query a composite database constituent graph


In this example it is assumed that the DBMS contains a composite database named myComposite, which
includes an alias named myConstituent:

Query

USE myComposite.myConstituent
MATCH (n) RETURN n

Query a composite database constituent graph dynamically


The built-in function graph.byName() can be used in the USE clause to resolve a constituent graph from a
STRING value containing the qualified name of a constituent.

This example uses a composite database named myComposite that includes an alias named myConstituent:

Query

USE graph.byName('myComposite.myConstituent')
MATCH (n) RETURN n

The argument can be any expression that evaluates to the name of a constituent graph - for example a
parameter:

Query

USE graph.byName($graphName)
MATCH (n) RETURN n

Query a composite database constituent using elementId


The graph.byElementId() function (introduced in Neo4j 5.13) can be used in the USE clause to resolve a constituent graph to which a given element id belongs. In the below example, it is assumed that the DBMS contains a composite database constituent which contains the element id 4:c0a65d96-4993-4b0c-b036-e7ebd9174905:0. If the constituent database is not a standard database in the DBMS, an error will be thrown.

Query

USE graph.byElementId("4:c0a65d96-4993-4b0c-b036-e7ebd9174905:0")
MATCH (n) RETURN n

WHERE

Introduction
The WHERE clause is not a clause in its own right — rather, it is part of the MATCH, OPTIONAL MATCH, and WITH
clauses.

When used with MATCH and OPTIONAL MATCH, WHERE adds constraints to the patterns described. It should not
be seen as a filter after the matching is finished.

In the case of WITH, however, WHERE simply filters the results.

In the case of multiple MATCH / OPTIONAL MATCH clauses, the predicate in WHERE is always a part of the
patterns in the directly preceding MATCH / OPTIONAL MATCH. Both results and performance may be impacted
if WHERE is put inside the wrong MATCH clause.

Indexes may be used to optimize queries using WHERE in a variety of cases.

Example graph
The following graph is used for the examples below:

The graph contains three Person nodes: 'Andy' (also labeled Swedish, age 36, belt 'white'), 'Timothy' (age 25), and 'Peter' (age 35, who also has an email property). 'Andy' KNOWS 'Timothy' (since 2012) and KNOWS 'Peter' (since 1999).

To recreate the graph, run the following query in an empty Neo4j database:

CREATE
(andy:Swedish:Person {name: 'Andy', age: 36, belt: 'white'}),
(timothy:Person {name: 'Timothy', age: 25}),
(peter:Person {name: 'Peter', age: 35, email: '[email protected]'}),
(andy)-[:KNOWS {since: 2012}]->(timothy),
(andy)-[:KNOWS {since: 1999}]->(peter)

Basic usage

Node pattern predicates


WHERE can appear inside a node pattern in a MATCH clause or a pattern comprehension:

Query

WITH 30 AS minAge
MATCH (a:Person WHERE a.name = 'Andy')-[:KNOWS]->(b:Person WHERE b.age > minAge)
RETURN b.name

Result

b.name

"Peter"


Rows: 1

When used this way, predicates in WHERE can reference the node variable that the WHERE clause belongs to,
but not other elements of the MATCH pattern.

The same rule applies to pattern comprehensions:

Query

MATCH (a:Person {name: 'Andy'})


RETURN [(a)-->(b WHERE b:Person) | b.name] AS friends

Result

friends

["Peter","Timothy"]

Rows: 1

Boolean operations
The following boolean operators can be used with the WHERE clause: AND, OR, XOR, and NOT. For more
information on how operators work with null, see the chapter on Working with null.

Query

MATCH (n:Person)
WHERE n.name = 'Peter' XOR (n.age < 30 AND n.name = 'Timothy') OR NOT (n.name = 'Timothy' OR n.name =
'Peter')
RETURN
n.name AS name,
n.age AS age
ORDER BY name

Result

name age

"Andy" 36

"Peter" 35

"Timothy" 25

Rows: 3

Filter on node label


To filter nodes by label, write a label predicate after the WHERE keyword using WHERE n:foo:

Query

MATCH (n)
WHERE n:Swedish
RETURN n.name, n.age

The name and age values for Andy are returned:

Result

n.name n.age

"Andy" 36

Rows: 1

Filter on node property


To filter on a node property, write your clause after the WHERE keyword:

Query

MATCH (n:Person)
WHERE n.age < 30
RETURN n.name, n.age

The name and age values for Timothy are returned because he is less than 30 years of age:

Result

n.name n.age

"Timothy" 25

Rows: 1

Filter on relationship property


To filter on a relationship property, write your clause after the WHERE keyword:

Query

MATCH (n:Person)-[k:KNOWS]->(f)
WHERE k.since < 2000
RETURN f.name, f.age, f.email

The name, age and email values for Peter are returned because Andy has known him since before 2000:

Result

f.name f.age f.email

"Peter" 35 "[email protected]"

Rows: 1

Filter on dynamically-computed node property


To filter on a property using a dynamically computed name, use square bracket syntax:

Query

WITH 'AGE' AS propname


MATCH (n:Person)
WHERE n[toLower(propname)] < 30
RETURN n.name, n.age

The name and age values for Timothy are returned because he is less than 30 years of age:

Result

n.name n.age

"Timothy" 25

Rows: 1

Property existence checking


Use the IS NOT NULL predicate to only include nodes or relationships in which a property exists:

Query

MATCH (n:Person)
WHERE n.belt IS NOT NULL
RETURN n.name, n.belt

The name and belt values for Andy are returned because he is the only one with a belt property:

Result

n.name n.belt

"Andy" "white"

Rows: 1

Using WITH
As WHERE is not considered a clause in its own right, its scope is not limited by a WITH directly before it.

Query

MATCH (n:Person)
WITH n.name as name
WHERE n.age = 25
RETURN name

Result

name

"Timothy"

Rows: 1

The name for Timothy is returned because the WHERE clause still acts as a filter on the MATCH. WITH reduces the scope for the rest of the query moving forward. In this case, name is now the only variable in scope for the RETURN clause.

STRING matching
The prefix and suffix of a STRING can be matched using STARTS WITH and ENDS WITH. To undertake a
substring search (that is, match regardless of the location within a STRING), use CONTAINS.

The matching is case-sensitive. Attempting to use these operators on values which are not STRING values
will return null.
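For example, comparing the INTEGER age property with STARTS WITH evaluates to null for every row and therefore filters out all rows (an illustrative sketch based on the statement above):

MATCH (n:Person)
WHERE n.age STARTS WITH '3'
RETURN n.name

This query returns no rows, because age is an INTEGER and the predicate evaluates to null.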

Prefix STRING search using STARTS WITH


The STARTS WITH operator is used to perform case-sensitive matching on the beginning of a STRING:

Query

MATCH (n:Person)
WHERE n.name STARTS WITH 'Pet'
RETURN n.name, n.age

The name and age values for Peter are returned because his name starts with "Pet":

Result

n.name n.age

"Peter" 35

Rows: 1

Suffix STRING search using ENDS WITH


The ENDS WITH operator is used to perform case-sensitive matching on the ending of a STRING:

Query

MATCH (n:Person)
WHERE n.name ENDS WITH 'ter'
RETURN n.name, n.age

The name and age values for Peter are returned because his name ends with "ter":

Result

n.name n.age

"Peter" 35

Rows: 1

Substring search using CONTAINS


The CONTAINS operator is used to perform case-sensitive matching regardless of location within a STRING:

Query

MATCH (n:Person)
WHERE n.name CONTAINS 'ete'
RETURN n.name, n.age

The name and age for Peter are returned because his name contains "ete":

Result

n.name n.age

"Peter" 35

Rows: 1

Checking if a STRING IS NORMALIZED


The IS NORMALIZED operator (introduced in Neo4j 5.17) is used to check whether the given STRING is in the
NFC Unicode normalization form:

Query

MATCH (n:Person)
WHERE n.name IS NORMALIZED
RETURN n.name AS normalizedNames

The given STRING values contain only normalized Unicode characters, therefore all the matched name
properties are returned. For more information, see the section about the normalization operator.

Result

normalizedNames

'Andy'

'Timothy'

'Peter'

Note that the IS NORMALIZED operator returns null when used on a non-STRING value. For example, RETURN
1 IS NORMALIZED returns null.

String matching negation


Use the NOT keyword to exclude all matches on given STRING from your result:

Query

MATCH (n:Person)
WHERE NOT n.name ENDS WITH 'y'
RETURN n.name, n.age

The name and age values for Peter are returned because his name does not end with "y":

Result

n.name n.age

"Peter" 35

Rows: 1

Regular expressions
Cypher supports filtering using regular expressions. The regular expression syntax is inherited from the
Java regular expressions. This includes support for flags that change how STRING values are matched,
including case-insensitive (?i), multiline (?m), and dotall (?s).

Flags are given at the beginning of the regular expression. For an example of a regular expression flag
given at the beginning of a pattern, see the case-insensitive regular expression section.

Matching using regular expressions


To match on regular expressions, use =~ 'regexp':

Query

MATCH (n:Person)
WHERE n.name =~ 'Tim.*'
RETURN n.name, n.age

The name and age values for Timothy are returned because his name starts with "Tim".

Result

n.name n.age

"Timothy" 25

Rows: 1

Escaping in regular expressions


Characters like . or * have special meaning in a regular expression. To use these as ordinary characters,
without special meaning, escape them.

Query

MATCH (n:Person)
WHERE n.email =~ '.*\\.com'
RETURN n.name, n.age, n.email

The name, age, and email values for Peter are returned because his email ends with ".com":

Result

n.name n.age n.email

"Peter" 35 "[email protected]"

Rows: 1

Note that the regular expression constructs in Java regular expressions are applied only after resolving the
escaped character sequences in the given string literal. It is sometimes necessary to add additional
backslashes to express regular expression constructs. This list clarifies the combination of these two
definitions, containing the original escape sequence and the resulting character in the regular expression:

String literal sequence | Resulting Regex sequence | Regex match

\t | Tab | Tab

\\t | \t | Tab

\b | Backspace | Backspace

\\b | \b | Word boundary

\n | Newline | Newline

\\n | \n | Newline

\r | Carriage return | Carriage return

\\r | \r | Carriage return

\f | Form feed | Form feed

\\f | \f | Form feed

\' | Single quote | Single quote

\" | Double quote | Double quote

\\ | Backslash | Backslash

\\\ | \\ | Backslash

\uxxxx | Unicode UTF-16 code point (4 hex digits must follow the \u) | Unicode UTF-16 code point (4 hex digits must follow the \u)

\\uxxxx | \uxxxx | Unicode UTF-16 code point (4 hex digits must follow the \u)

Using regular expressions with unsanitized user input makes you vulnerable to Cypher
 injection. Consider using parameters instead.
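For instance, a user-supplied pattern can be passed as a parameter rather than concatenated into the query text (a sketch; the parameter name $namePattern is illustrative):

MATCH (n:Person)
WHERE n.name =~ $namePattern
RETURN n.name, n.age

With $namePattern set to 'Tim.*', this returns the same rows as the first example in this section.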

Case-insensitive regular expressions


By pre-pending a regular expression with (?i), the whole expression becomes case-insensitive:

Query

MATCH (n:Person)
WHERE n.name =~ '(?i)AND.*'
RETURN n.name, n.age

The name and age for Andy are returned because his name starts with 'AND' irrespective of casing:

Result

n.name n.age

"Andy" 36

Rows: 1

Path pattern expressions


Similar to existential subqueries, path pattern expressions can be used to assert whether a specified path
exists at least once in a graph. While existential subqueries are more powerful and capable of performing
anything achievable with path pattern expressions, path pattern expressions are more concise.

Path pattern expressions have the following restrictions (use cases that require extended functionality
should consider using existential subqueries instead):

• Path pattern expressions may only use a subset of graph pattern semantics.

• A path pattern expression must be a path pattern of length greater than zero. In other words, it must
contain at least one relationship or variable-length relationship.

• Path pattern expressions may not declare new variables. They can only reference existing variables.

• Path pattern expressions may only be used in positions where a boolean expression is expected. The
following sections will demonstrate how to use path pattern expressions in a WHERE clause.

Filter on patterns
Query

MATCH
(timothy:Person {name: 'Timothy'}),
(other:Person)
WHERE (other)-->(timothy)
RETURN other.name, other.age

The name and age values for nodes that have an outgoing relationship to Timothy are returned:

Result

other.name other.age

"Andy" 36

Rows: 1

Filter on patterns using NOT


The NOT operator can be used to exclude a pattern:

Query

MATCH
(peter:Person {name: 'Peter'}),
(other:Person)
WHERE NOT (other)-->(peter)
RETURN other.name, other.age

The name and age values for nodes that do not have an outgoing relationship to Peter are returned:

Result

other.name other.age

"Timothy" 25

"Peter" 35

Rows: 2

Filter on patterns with properties


Properties can also be added to patterns:

Query

MATCH (other:Person)
WHERE (other)-[:KNOWS]-({name: 'Timothy'})
RETURN other.name, other.age

The name and age values are returned for nodes that have a relationship with the type KNOWS connected to
Timothy:

Result

other.name other.age

"Andy" 36

Rows: 1

Lists

IN operator
To check if an element exists in a list, use the IN operator. The below query checks whether a property
exists in a literal list:

Query

MATCH (a:Person)
WHERE a.name IN ['Peter', 'Timothy']
RETURN a.name, a.age

Result

a.name a.age

"Timothy" 25

"Peter" 35

Rows: 2

171
Missing properties and values

Default to false if property is missing


As missing properties evaluate to null, the comparison in the example will evaluate to false for nodes
without the belt property:

Query

MATCH (n:Person)
WHERE n.belt = 'white'
RETURN n.name, n.age, n.belt

Only the name, age, and belt values of nodes with white belts are returned:

Result

n.name n.age n.belt

"Andy" 36 "white"

Rows: 1

Default to true if property is missing


To compare node or relationship properties against missing properties, use the IS NULL operator:

Query

MATCH (n:Person)
WHERE n.belt = 'white' OR n.belt IS NULL
RETURN n.name, n.age, n.belt
ORDER BY n.name

This returns all values for all nodes, even those without the belt property:

Result

n.name n.age n.belt

"Andy" 36 "white"

"Peter" 35 <null>

"Timothy" 25 <null>

Rows: 3

Filter on null
To test if a value or variable is null, use the IS NULL operator. To test if a value or variable is not null, use the IS NOT NULL operator. NOT (x IS NULL) also works.
Query

MATCH (person:Person)
WHERE person.name = 'Peter' AND person.belt IS NULL
RETURN person.name, person.age, person.belt

The name and age values for nodes that have name Peter but no belt property are returned:

Result

person.name person.age person.belt

"Peter" 35 <null>

Rows: 1

Using ranges

Simple range
To check whether an element exists within a specific range, use the inequality operators <, <=, >=, >:

Query

MATCH (a:Person)
WHERE a.name >= 'Peter'
RETURN a.name, a.age

The name and age values of nodes having a name property lexicographically (i.e. using the dictionary order)
greater than or equal to Peter are returned:

Result

a.name a.age

"Timothy" 25

"Peter" 35

Rows: 2

Composite range
Several inequalities can be used to construct a range:

Query

MATCH (a:Person)
WHERE a.name > 'Andy' AND a.name < 'Timothy'
RETURN a.name, a.age

The name and age values of nodes having a name property lexicographically between Andy and Timothy are
returned:

Result

a.name a.age

"Peter" 35

Rows: 1

Pattern element predicates


WHERE clauses can be added to pattern elements in order to specify additional constraints:

Relationship pattern predicates


WHERE can also appear inside a relationship pattern in a MATCH clause:

Query

WITH 2000 AS minYear


MATCH (a:Person)-[r:KNOWS WHERE r.since < minYear]->(b:Person)
RETURN r.since

Result

r.since

1999

Rows: 1

However, it cannot be used inside of variable length relationships, as this would lead to an error. For
example:

Query

WITH 2000 AS minYear


MATCH (a:Person)-[r:KNOWS*1..3 WHERE r.since > b.yearOfBirth]->(b:Person)
RETURN r.since

Error message

Relationship pattern predicates are not supported for variable-length relationships.

Putting predicates inside a relationship pattern can help with readability. Note that it is strictly equivalent
to using a standalone WHERE sub-clause.

Query

WITH 2000 AS minYear


MATCH (a:Person)-[r:KNOWS]->(b:Person)
WHERE r.since < minYear
RETURN r.since

Result

r.since

1999


Rows: 1

Relationship pattern predicates can also be used inside pattern comprehensions, where the same caveats
apply:

Query

WITH 2000 AS minYear


MATCH (a:Person {name: 'Andy'})
RETURN [(a)-[r:KNOWS WHERE r.since < minYear]->(b:Person) | r.since] AS years

Result

years

[1999]

Rows: 1

WITH
The WITH clause allows query parts to be chained together, piping the results from one to be used as
starting points or criteria in the next.

It is important to note that WITH affects variables in scope. Any variables not included in

 the WITH clause are not carried over to the rest of the query. The wildcard * can be used
to include all variables that are currently in scope.

Using WITH, you can manipulate the output before it is passed on to the following query parts.
Manipulations can be done to the shape and/or number of entries in the result set.

One common usage of WITH is to limit the number of entries passed on to other MATCH clauses. By
combining ORDER BY and LIMIT, it is possible to get the top X entries by some criteria and then bring in
additional data from the graph.

WITH can also be used to introduce new variables containing the results of expressions for use in the
following query parts (see Introducing variables for expressions). For convenience, the wildcard * expands
to all variables that are currently in scope and carries them over to the next query part (see Using the
wildcard to carry over variables).

Another use is to filter on aggregated values. WITH is used to introduce aggregates which can then be used
in predicates in WHERE. These aggregate expressions create new bindings in the results.

WITH is also used to separate reading from updating of the graph. Every part of a query must be either
read-only or write-only. When going from a writing part to a reading part, the switch must be done with a
WITH clause.
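As a minimal sketch of this rule (standalone, not run against the example graph below; the node and name 'Klara' are illustrative), a WITH clause switches from the writing part of the query to the reading part:

CREATE (newPerson {name: 'Klara'})
WITH newPerson
MATCH (other)
WHERE other.name <> newPerson.name
RETURN newPerson.name, other.name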

The example graph contains five nodes with a name property: 'Anders', 'Bossman', 'Caesar', 'David', and 'George'. 'David' KNOWS 'Anders'; 'Anders' KNOWS 'Bossman' and BLOCKS 'Caesar'; 'Bossman' KNOWS 'George' and BLOCKS 'David'; 'Caesar' KNOWS 'George'.
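The original CREATE statement for this graph is not reproduced here. A reconstruction consistent with the relationships shown in the examples below (node labels omitted, as none are used by the queries) could look like:

CREATE
(anders {name: 'Anders'}),
(bossman {name: 'Bossman'}),
(caesar {name: 'Caesar'}),
(david {name: 'David'}),
(george {name: 'George'}),
(david)-[:KNOWS]->(anders),
(anders)-[:KNOWS]->(bossman),
(anders)-[:BLOCKS]->(caesar),
(bossman)-[:KNOWS]->(george),
(bossman)-[:BLOCKS]->(david),
(caesar)-[:KNOWS]->(george)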

Introducing variables for expressions


You can introduce new variables for the result of evaluating expressions.

Query

MATCH (george {name: 'George'})<--(otherPerson)


WITH otherPerson, toUpper(otherPerson.name) AS upperCaseName
WHERE upperCaseName STARTS WITH 'C'
RETURN otherPerson.name

This query returns the name of persons connected to 'George' whose name starts with a C, regardless of
capitalization.

Result

otherPerson.name

"Caesar"

Rows: 1

Using the wildcard to carry over variables


You can use the wildcard * to carry over all variables that are in scope, in addition to introducing new
variables.

Query

MATCH (person)-[r]->(otherPerson)
WITH *, type(r) AS connectionType
RETURN person.name, otherPerson.name, connectionType

This query returns the names of all related persons and the type of relationship between them.

Result

person.name otherPerson.name connectionType

"David" "Anders" "KNOWS"

"Anders" "Bossman" "KNOWS"


"Anders" "Caesar" "BLOCKS"

"Bossman" "David" "BLOCKS"

"Bossman" "George" "KNOWS"

"Caesar" "George" "KNOWS"

Rows: 6

Filter on aggregate function results


Aggregated results have to pass through a WITH clause to be able to filter on.

Query

MATCH (david {name: 'David'})--(otherPerson)-->()


WITH otherPerson, count(*) AS foaf
WHERE foaf > 1
RETURN otherPerson.name

The name of the person connected to 'David' with more than one outgoing relationship will be returned by the query.

Result

otherPerson.name

"Anders"

Rows: 1

Sort results before using collect on them


You can sort your results before passing them to collect, thus sorting the resulting list.

Query

MATCH (n)
WITH n
ORDER BY n.name DESC
LIMIT 3
RETURN collect(n.name)

A list of the names of people in reverse order, limited to 3, is returned in a list.

Result

collect(n.name)

["George","David","Caesar"]

Rows: 1

177
Limit branching of a path search
You can match paths, limit to a certain number, and then match again using those paths as a base, as well
as any number of similar limited searches.

Query

MATCH (n {name: 'Anders'})--(m)


WITH m
ORDER BY m.name DESC
LIMIT 1
MATCH (m)--(o)
RETURN o.name

Starting at 'Anders', find all matching nodes, order by name descending and get the top result, then find all
the nodes connected to that top result, and return their names.

Result

o.name

"Anders"

"Bossman"

Rows: 2

Limit and Filtering


It is possible to limit and filter on the same WITH clause. Note that the LIMIT clause is applied before the
WHERE clause.

Query

UNWIND [1, 2, 3, 4, 5, 6] AS x
WITH x
LIMIT 5
WHERE x > 2
RETURN x

The limit is first applied, reducing the rows to the first 5 items in the list. The filter is then applied, reducing
the final result as seen below:

Result

x

3

4

5

Rows: 3

If the desired outcome is to filter and then limit, the filtering needs to occur in its own step:

Query

UNWIND [1, 2, 3, 4, 5, 6] AS x
WITH x
WHERE x > 2
WITH x
LIMIT 5
RETURN x

This time the filter is applied first, reducing the rows to consist of the list [3, 4, 5, 6]. Then the limit is
applied. As the limit is larger than the total number of remaining rows, all rows are returned.

Result

x

3

4

5

6

Rows: 4

179
Subqueries
A Cypher subquery is called from an enclosing outer query, and executes within its own scope, as defined
by { and }.

For more information, see the following sections:

• CALL subqueries

• CALL subqueries in transactions

• EXISTS subqueries

• COUNT subqueries

• COLLECT subqueries (new)

CALL subqueries
The CALL clause can be used to invoke subqueries that execute operations within a defined scope, thereby
optimizing data handling and query efficiency. Unlike other subqueries in Cypher, CALL subqueries can be
used to perform changes to the database (e.g. CREATE new nodes).

The CALL clause is also used for calling procedures. For descriptions of the CALL clause in
 this context, refer to the CALL procedure.

Example graph
A graph with the following schema is used for the examples below:

To recreate the graph, run the following query in an empty Neo4j database:

180
CREATE (teamA:Team {name: 'Team A'}),
(teamB:Team {name: 'Team B'}),
(teamC:Team {name: 'Team C'}),
(playerA:Player {name: 'Player A', age: 21}),
(playerB:Player {name: 'Player B', age: 23}),
(playerC:Player {name: 'Player C', age: 19}),
(playerD:Player {name: 'Player D', age: 30}),
(playerE:Player {name: 'Player E', age: 25}),
(playerF:Player {name: 'Player F', age: 35}),
(playerA)-[:PLAYS_FOR]->(teamA),
(playerB)-[:PLAYS_FOR]->(teamA),
(playerD)-[:PLAYS_FOR]->(teamB),
(playerE)-[:PLAYS_FOR]->(teamC),
(playerF)-[:PLAYS_FOR]->(teamC),
(teamA)-[:OWES {dollars: 1500}]->(teamB),
(teamA)-[:OWES {dollars: 3000}]->(teamB),
(teamB)-[:OWES {dollars: 1700}]->(teamC),
(teamC)-[:OWES {dollars: 5000}]->(teamB)

Semantics and performance


A CALL subquery is executed once for each incoming row. The variables returned in a subquery are
available to the outer scope of the enclosing query.

Example 38. Basic example

In this example, the CALL subquery executes three times, one for each row that the UNWIND clause
outputs.

Query

UNWIND [0, 1, 2] AS x
CALL () {
RETURN 'hello' AS innerReturn
}
RETURN innerReturn

Result

innerReturn

'hello'

'hello'

'hello'

Rows: 3

Each execution of a CALL subquery can observe changes from previous executions. This allows for the
accumulation of results and the progressive transformation of data within a single Cypher query.

181
Example 39. Incremental updates

In this example, each iteration of the CALL subquery adds 1 to the age of Player A and the returned
newAge reflects the age after each increment.

Incrementally update the age property of a Player

UNWIND [1, 2, 3] AS x
CALL () {
MATCH (p:Player {name: 'Player A'})
SET p.age = p.age + 1
RETURN p.age AS newAge
}
MATCH (p:Player {name: 'Player A'})
RETURN x AS iteration, newAge, p.age AS totalAge

Result

iteration newAge totalAge

1 22 24

2 23 24

3 24 24

Rows: 3

The scoping effect of a CALL subquery means that the work performed during the execution of each row can be cleaned up as soon as its execution ends, before proceeding to the next row. This allows for efficient
resource management and reduces memory overhead by ensuring that temporary data structures created
during the subquery execution do not persist beyond their usefulness. As a result, CALL subqueries can
help maintain optimal performance and scalability, especially in complex or large-scale queries.

182
Example 40. Performance

In this example, a CALL subquery is used to collect a LIST containing all players who play for a
particular team.

Collect a list of all players playing for a particular team

MATCH (t:Team)
CALL (t) {
MATCH (p:Player)-[:PLAYS_FOR]->(t)
RETURN collect(p) as players
}
RETURN t AS team, players

Result

team players

(:Team {name: "Team A"}) [(:Player {name: "Player B", age: 23}), (:Player {name: "Player A", age: 24})]

(:Team {name: "Team B"}) [(:Player {name: "Player D", age: 30})]

(:Team {name: "Team C"}) [(:Player {name: "Player F", age: 35}), (:Player {name: "Player E", age: 25})]

Rows: 3

The CALL subquery ensures that each Team is processed separately (one row per Team node), rather
than having to hold every Team and Player node in heap memory simultaneously before collecting
them into lists. Using a CALL subquery can therefore reduce the amount of heap memory required for
an operation.

Importing variables
Variables from the outer scope must be explicitly imported into the inner scope of the CALL subquery,
either by using a variable scope clause or an importing WITH clause (deprecated). As the subquery is
evaluated for each incoming input row, the imported variables are assigned the corresponding values from
that row.

The variable scope clause (introduced in Neo4j 5.23)


Variables can be imported into a CALL subquery using a scope clause: CALL (<variable>). Using the scope
clause disables the deprecated importing WITH clause.

A scope clause can be used to import all, specific, or none of the variables from the outer scope.

183
Example 41. Import specific variables from the outer scope

This example only imports the p variable from the outer scope and uses it to create a new, randomly
generated, rating property for each Player node. It then returns the Player node with the highest
rating.

Import one variable from the outer scope

MATCH (p:Player), (t:Team)


CALL (p) {
WITH rand() AS random
SET p.rating = random
RETURN p.name AS playerName, p.rating AS rating
}
RETURN playerName, rating, t AS team
ORDER BY rating
LIMIT 1

Result

playerName rating team

"Player C" 0.9307432039870395 "Team A"

Rows: 1

To import additional variables, include them within the parentheses after CALL, separated by commas.
For example, to import both variables from the MATCH clause in the above query, modify the scope
clause accordingly: CALL (p, t).

Example 42. Import all variables

To import all variables from the outer scope, use CALL (*). This example imports both the p and t
variables and sets a new lastUpdated property on both.

Import all variables from the outer scope

MATCH (p:Player), (t:Team)


CALL (*) {
SET p.lastUpdated = timestamp()
SET t.lastUpdated = timestamp()
}
RETURN p.name AS playerName,
p.lastUpdated AS playerUpdated,
t.name AS teamName,
t.lastUpdated AS teamUpdated
LIMIT 1

Result

playerName playerUpdated teamName teamUpdated

"Player A" 1719304206653 "Team A" 1719304206653

Rows: 1

184
Example 43. Import no variables

To import no variables from the outer scope, use CALL ().

Import no variables from the outer scope

MATCH (t:Team)
CALL () {
MATCH (p:Player)
RETURN count(p) AS totalPlayers
}
RETURN count(t) AS totalTeams, totalPlayers

Result

totalTeams totalPlayers

3 6

Rows: 1

As of Neo4j 5.23, it is deprecated to use CALL subqueries without a variable scope


clause.

Deprecated

 MATCH (t:Team)
CALL {
MATCH (p:Player)
RETURN count(p) AS totalPlayers
}
RETURN count(t) AS totalTeams, totalPlayers

Rules

• The scope clause’s variables can be globally referenced in the subquery. A subsequent WITH within the
subquery cannot delist an imported variable. The deprecated importing WITH clause behaves differently
because imported variables can only be referenced from the first line and can be delisted by
subsequent clauses.
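For example (a sketch based on the example graph above), the imported variable t remains available after a subsequent WITH inside the subquery, even though the WITH does not carry it over explicitly:

MATCH (t:Team)
CALL (t) {
MATCH (p:Player)-[:PLAYS_FOR]->(t)
WITH p.name AS playerName
RETURN playerName, t.name AS teamName
}
RETURN playerName, teamName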

• Variables cannot be aliased in the scope clause. Only simple variable references are allowed.

Not allowed

MATCH (t:Team)
CALL (t AS teams) {
MATCH (p:Player)-[:PLAYS_FOR]->(teams)
RETURN collect(p) as players
}
RETURN t AS teams, players

• The scope clause’s variables cannot be re-declared in the subquery.

185
Not allowed

MATCH (t:Team)
CALL (t) {
WITH 'New team' AS t
MATCH (p:Player)-[:PLAYS_FOR]->(t)
RETURN collect(p) as players
}
RETURN t AS team, players

• The subquery cannot return a variable name which already exists in the outer scope. To return
imported variables they must be renamed.

Not allowed

MATCH (t:Team)
CALL (t) {
RETURN t
}
RETURN t

Importing WITH clause (deprecated)


Variables can also be imported into a CALL subquery using an importing WITH clause. Note that this syntax
is not GQL conformant.

Variables imported by WITH clause

MATCH (t:Team)
CALL {
WITH t
MATCH (p:Player)-[:PLAYS_FOR]->(t)
RETURN collect(p) as players
}
RETURN t AS teams, players

The following additional rules apply when using an importing WITH clause:

• Just as when using a variable scope clause, a subquery using an importing WITH clause cannot
return a variable name which already exists in the outer scope. To return imported variables they
must be renamed.

• The importing WITH clause must be the first clause of a subquery (or the second clause, if directly
following a USE clause).

• It is not possible to follow an importing WITH clause with any of the following clauses: DISTINCT,
ORDER BY, WHERE, SKIP, and LIMIT.

Attempting any of the above will throw an error. For example, the following query using a WHERE
clause after an importing WITH clause will throw an error:

Not Allowed

UNWIND [[1,2],[1,2,3,4],[1,2,3,4,5]] AS l
CALL {
WITH l
WHERE size(l) > 2
RETURN l AS largeLists
}
RETURN largeLists

Error message

Importing WITH should consist only of simple references to outside variables.


WHERE is not allowed.

A solution to this restriction, necessary for any filtering or ordering of an importing WITH clause, is to
declare a second WITH clause after the importing WITH clause. This second WITH clause will act as a
regular WITH clause. For example, the following query will not throw an error:

Allowed

UNWIND [[1,2],[1,2,3,4],[1,2,3,4,5]] AS l
CALL {
WITH l
WITH l
WHERE size(l) > 2
RETURN l AS largeLists
}
RETURN largeLists

Result

largeLists

[1, 2, 3, 4]

[1, 2, 3, 4, 5]

Rows: 2

187
Optional subquery calls (introduced in Neo4j 5.24)
OPTIONAL CALL allows for optional execution of a CALL subquery. Similar to OPTIONAL MATCH any empty
rows produced by the OPTIONAL CALL subquery will return null.

188
Example 44. Difference between using CALL and OPTIONAL CALL

This example, which finds the team that each Player plays for, highlights the difference between
using CALL and OPTIONAL CALL.

Regular subquery CALL

MATCH (p:Player)
CALL (p) {
MATCH (p)-[:PLAYS_FOR]->(team:Team)
RETURN team
}
RETURN p.name AS playerName, team.name AS team

Result

playerName team

"Player A" "Team A"

"Player B" "Team A"

"Player D" "Team B"

"Player E" "Team C"

"Player F" "Team C"

Rows: 5

Note that no results are returned for Player C, since they are not connected to any Team with a
PLAYS_FOR relationship.

Query using regular OPTIONAL CALL

MATCH (p:Player)
OPTIONAL CALL (p) {
MATCH (p)-[:PLAYS_FOR]->(team:Team)
RETURN team
}
RETURN p.name AS playerName, team.name AS team

Now all Player nodes, regardless of whether they have any PLAYS_FOR relationships connected to a
Team, are returned.

Result

playerName team

"Player A" "Team A"

"Player B" "Team A"

"Player C" NULL

"Player D" "Team B"

"Player E" "Team C"

"Player F" "Team C"

Rows: 6

189
Execution order of CALL subqueries
The order in which rows from the outer scope are passed into subqueries is not defined. If the results of
the subquery depend on the order of these rows, use an ORDER BY clause before the CALL clause to
guarantee a specific processing order for the rows.

Example 45. Ordering results before CALL subquery

This example creates a linked list of all Player nodes in order of ascending age.

The CALL clause is relying on the incoming row ordering to ensure that a correctly ordered linked list is
created, thus the incoming rows must be ordered with a preceding ORDER BY clause.

Order results before a CALL subquery

MATCH (player:Player)
WITH player
ORDER BY player.age ASC LIMIT 1
SET player:ListHead
WITH *
MATCH (nextPlayer: Player&!ListHead)
WITH nextPlayer
ORDER BY nextPlayer.age
CALL (nextPlayer) {
MATCH (current:ListHead)
REMOVE current:ListHead
SET nextPlayer:ListHead
CREATE(current)-[:IS_YOUNGER_THAN]->(nextPlayer)
RETURN current AS from, nextPlayer AS to
}
RETURN
from.name AS name,
from.age AS age,
to.name AS closestOlderName,
to.age AS closestOlderAge

Result

name age closestOlderName closestOlderAge

"Player C" 19 "Player B" 23

"Player B" 23 "Player A" 24

"Player A" 24 "Player E" 25

"Player E" 25 "Player D" 30

"Player D" 30 "Player F" 35

Rows: 5

Post-union processing
Call subqueries can be used to further process the results of a UNION query.

190
Example 46. Using UNION within a CALL subquery

This example query finds the youngest and the oldest Player in the graph.

Find the oldest and youngest players

CALL () {
MATCH (p:Player)
RETURN p
ORDER BY p.age ASC
LIMIT 1
UNION
MATCH (p:Player)
RETURN p
ORDER BY p.age DESC
LIMIT 1
}
RETURN p.name AS playerName, p.age AS age

Result

playerName age

"Player C" 19

"Player F" 35

Rows: 2

If different parts of a result should be matched differently, with some aggregation over the whole
result, subqueries need to be used. The example below query uses a CALL subquery in combination
with UNION ALL to determine how much each Team in the graph owes or is owed.

Find how much every team is owed

MATCH (t:Team)
CALL (t) {
OPTIONAL MATCH (t)-[o:OWES]->(other:Team)
RETURN o.dollars * -1 AS moneyOwed
UNION ALL
OPTIONAL MATCH (other)-[o:OWES]->(t)
RETURN o.dollars AS moneyOwed
}
RETURN t.name AS team, sum(moneyOwed) AS amountOwed
ORDER BY amountOwed DESC

Result

team amountOwed

"Team B" 7800

"Team C" -3300

"Team A" -4500

Rows: 3

Aggregations
Returning subqueries change the number of results of the query. The result of the CALL subquery is the combined result of evaluating the subquery for each input row.

Example 47. CALL subquery changing returned rows of outer query

The following example finds the name of each Player and the team they play for. No rows are
returned for Player C, since they are not connected to a Team with a PLAYS_FOR relationship. The
number of results of the subquery thus changed the number of results of the enclosing query.

Find the team each player plays for

MATCH (p:Player)
CALL (p) {
MATCH (p)-[:PLAYS_FOR]->(team:Team)
RETURN team.name AS team
}
RETURN p.name AS playerName, team

Result

playerName team

"Player A" "Team A"

"Player B" "Team A"

"Player D" "Team B"

"Player E" "Team C"

"Player F" "Team C"

Rows: 5

Example 48. CALL subqueries and isolated aggregations

Subqueries can also perform isolated aggregations. The below example uses the sum() function to
count how much money is owed between the Team nodes in the graph. Note that the owedAmount for
Team A is the aggregated result of two OWES relationships to Team B.

Find how much each team owes

MATCH (t:Team)
CALL (t) {
MATCH (t)-[o:OWES]->(t2:Team)
RETURN sum(o.dollars) AS owedAmount, t2.name AS owedTeam
}
RETURN t.name AS owingTeam, owedAmount, owedTeam

Result

owingTeam owedAmount owedTeam

"Team A" 4500 "Team B"

"Team B" 1700 "Team C"

"Team C" 5000 "Team B"

Rows: 4

192
Note on returning subqueries and unit subqueries
The examples above have all used subqueries which end with a RETURN clause. These subqueries are called
returning subqueries.

A subquery is evaluated for each incoming input row. Every output row of a returning subquery is
combined with the input row to build the result of the subquery. That means that a returning subquery will
influence the number of rows. If the subquery does not return any rows, there will be no rows available
after the subquery.

Subqueries without a RETURN statement are called unit subqueries. Unit subqueries are used for their ability
to alter the graph with clauses such as CREATE, MERGE, SET, and DELETE. They do not explicitly return
anything, and this means that the number of rows present after the subquery is the same as was going
into the subquery.

Unit subqueries
Unit subqueries are used for their ability to alter the graph with updating clauses. They do not impact the
amount of rows returned by the enclosing query.

This example query creates 3 clones of each existing Player node in the graph. As the subquery is a unit
subquery, it does not change the number of rows of the enclosing query.

Create cloned nodes

MATCH (p:Player)
CALL (p) {
UNWIND range (1, 3) AS i
CREATE (:Person {name: p.name})
}
RETURN count(*)

Result

count(*)

6

Rows: 1
Nodes created: 18
Properties set: 18
Labels added: 18

Summary
• CALL subqueries optimize data handling and query efficiency, and can perform changes to the
database.

• CALL subqueries allow for row-by-row data transformation and enable the accumulation of results
across multiple rows, facilitating complex operations that depend on intermediate or aggregated data.

• CALL subqueries can only refer to variables from the enclosing query if they are explicitly imported by
either a variable scope clause or an importing WITH clause (deprecated).

193
• All variables that are returned from a CALL subquery are afterwards available in the enclosing query.

• Returning subqueries (with RETURN clause) influence the number of output rows, while unit subqueries
(without RETURN clause) perform graph updates without changing the number of rows.

• An ORDER BY clause can be used before CALL subqueries to ensure a specific order.

• CALL subqueries can be used in combination with UNION to process and aggregate different parts of a
query result.

CALL subqueries in transactions


CALL subqueries can be made to execute in separate, inner transactions, producing intermediate commits.
This can be useful when doing large write operations, like batch updates, imports, and deletes.

To execute a CALL subquery in separate transactions, add the modifier IN TRANSACTIONS after the subquery.
An outer transaction is opened to report back the accumulated statistics for the inner transactions (created
and deleted nodes, relationships, etc.) and it will succeed or fail depending on the results of those inner
transactions. By default, inner transactions group together batches of 1000 rows. Cancelling the outer
transaction will cancel the inner ones as well.

CALL { … } IN TRANSACTIONS is only allowed in implicit transactions.

 If you are using Neo4j Browser, you must prepend any queries using CALL { … } IN
TRANSACTIONS with :auto.

The examples on this page use a variable scope clause (introduced in Neo4j 5.23) to

 import variables into the CALL subquery. If you are using an older version of Neo4j, use
an importing WITH clause instead.

Syntax

CALL {
subQuery
} IN [[concurrency] CONCURRENT] TRANSACTIONS
[OF batchSize ROW[S]]
[REPORT STATUS AS statusVar]
[ON ERROR {CONTINUE | BREAK | FAIL}];

Loading CSV data


This example uses a CSV file and the LOAD CSV clause to import data into the database. It creates nodes in
separate transactions using CALL { … } IN TRANSACTIONS:

friends.csv

1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24

194
Query

LOAD CSV FROM 'file:///friends.csv' AS line


CALL (line) {
CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS

Result
(empty result)

Rows: 0
Nodes created: 5
Properties set: 10
Labels added: 5
Transactions committed: 1

As the size of the CSV file in this example is small, only a single separate transaction is started and
committed.

Deleting a large volume of data


Using CALL { … } IN TRANSACTIONS is the recommended way of deleting a large volume of data.

Example 49. DETACH DELETE on all nodes

Query

MATCH (n)
CALL (n) {
DETACH DELETE n
} IN TRANSACTIONS

Result
(empty result)

Rows: 0
Nodes deleted: 5
Relationships deleted: 2
Transactions committed: 1

195
Example 50. DETACH DELETE on only some nodes

The CALL { … } IN TRANSACTIONS subquery should not be modified.

Any necessary filtering can be done before the subquery.

Query

MATCH (n:Label) WHERE n.prop > 100


CALL (n) {
DETACH DELETE n
} IN TRANSACTIONS

Result
(empty result)

Rows: 0

The batching is performed on the input rows fed into CALL { … } IN TRANSACTIONS, so
the data must be supplied from outside the call in order for the batching to have an

 effect. That is why the nodes are matched outside the subqueries in the examples above.
If the MATCH clause were inside the subquery, the data deletion would run as one single
transaction.

Batching
The amount of work to do in each separate transaction can be specified in terms of how many input rows
to process before committing the current transaction and starting a new one. The number of input rows is
set with the modifier OF n ROWS (or OF n ROW). If omitted, the default batch size is 1000 rows. The number
of rows can be expressed using any expression that evaluates to a positive integer and does not refer to
nodes or relationships.

This example loads a CSV file with one transaction for every 2 input rows:

friends.csv

1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24

Query

LOAD CSV FROM 'file:///friends.csv' AS line


CALL (line) {
CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS OF 2 ROWS

Result
(empty result)

196
Rows: 0
Nodes created: 5
Properties set: 10
Labels added: 5
Transactions committed: 3

The query now starts and commits three separate transactions:

1. The first two executions of the subquery (for the first two input rows from LOAD CSV) take place in the
first transaction.

2. The first transaction is then committed before proceeding.

3. The next two executions of the subquery (for the next two input rows) take place in a second
transaction.

4. The second transaction is committed.

5. The last execution of the subquery (for the last input row) takes place in a third transaction.

6. The third transaction is committed.

You can also use CALL { … } IN TRANSACTIONS OF n ROWS to delete all your data in batches in order to
avoid a huge garbage collection or an OutOfMemory exception. For example:

Query

MATCH (n)
CALL (n) {
DETACH DELETE n
} IN TRANSACTIONS OF 2 ROWS

Result
(empty result)

Rows: 0
Nodes deleted: 9
Relationships deleted: 2
Transactions committed: 5

Up to a point, using a larger batch size will be more performant. The batch size of 2 ROWS

 is an example given the small data set used here. For larger data sets, you might want to
use larger batch sizes, such as 10000 ROWS.

Composite databases (introduced in Neo4j 5.18)


As of Neo4j 5.18, CALL { … } IN TRANSACTIONS can be used with composite databases.

Even though composite databases allow accessing multiple graphs in a single query, only one graph can
be modified in a single transaction. CALL { … } IN TRANSACTIONS offers a way of constructing queries
which modify multiple graphs.

While the previous examples are generally valid for composite databases, there are a few extra factors that come into play when working with composite databases in subqueries. The following examples show how you can use CALL { … } IN TRANSACTIONS on a composite database.

Example 51. Import a CSV file on all constituents

friends.csv

1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24

Create Person nodes on all constituents, drawing data from friends.csv

UNWIND graph.names() AS graphName
LOAD CSV FROM 'file:///friends.csv' AS line
CALL (*) {
  USE graph.byName( graphName )
  CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS

Example 52. Remove all nodes and relationships from all constituents

Query

UNWIND graph.names() AS graphName
CALL {
  USE graph.byName( graphName )
  MATCH (n)
  RETURN elementId(n) AS id
}
CALL {
  USE graph.byName( graphName )
  WITH id
  MATCH (n)
  WHERE elementId(n) = id
  DETACH DELETE n
} IN TRANSACTIONS

Since the batching is performed on the input rows fed into CALL { … } IN TRANSACTIONS, the data must be supplied from outside the subquery in order for the batching to have an effect. That is why the nodes are matched in a subquery preceding the one that actually deletes the data. If the MATCH clause were inside the second subquery, the data deletion would run as one single transaction.

There is currently a known issue. When an error occurs during CALL { … } IN TRANSACTIONS processing, the error message includes information about how many transactions have been committed. That information is inaccurate on composite databases, as it always reports (Transactions committed: 0).

Batch size in composite databases


Because CALL { … } IN TRANSACTIONS subqueries targeting different graphs can't be interleaved, if a USE clause evaluates to a different target than the current one, the current batch is committed and the next batch is created.

The batch size declared with IN TRANSACTIONS OF … ROWS represents an upper limit of the batch size, but
the real batch size depends on how many input rows target one database in sequence. Every time the
target database changes, the batch is committed.

Example 53. Behavior of IN TRANSACTIONS OF ROWS on composite databases

The next example assumes the existence of two constituents remoteGraph1 and remoteGraph2 for the
composite database composite.

While the declared batch size is 3, only the first 2 rows act on composite.remoteGraph1, so the batch size for the first transaction is 2. That is followed by a batch of 3 rows on composite.remoteGraph2, then 1 more row on composite.remoteGraph2, and finally 2 rows on composite.remoteGraph1.

Query

WITH ['composite.remoteGraph1', 'composite.remoteGraph2'] AS graphs
UNWIND [0, 0, 1, 1, 1, 1, 0, 0] AS i
WITH graphs[i] AS g
CALL (g) {
  USE graph.byName( g )
  CREATE ()
} IN TRANSACTIONS OF 3 ROWS

Error behavior Label—new 5.7


Users can choose one of three different option flags to control the behavior in case of an error occurring in
any of the inner transactions of CALL { … } IN TRANSACTIONS:

• ON ERROR CONTINUE to ignore a recoverable error and continue the execution of subsequent inner
transactions. The outer transaction succeeds. It will cause the expected variables from the failed inner
query to be bound as null for that specific transaction.

• ON ERROR BREAK to ignore a recoverable error and stop the execution of subsequent inner transactions.
The outer transaction succeeds. It will cause expected variables from the failed inner query to be
bound as null for all onward transactions (including the failed one).

• ON ERROR FAIL to acknowledge a recoverable error and stop the execution of subsequent inner
transactions. The outer transaction fails. This is the default behavior if no flag is explicitly specified.

On error, any previously committed inner transactions remain committed, and are not rolled back. Any failed inner transactions are rolled back.

In the following example, the last subquery execution in the second inner transaction fails due to division
by zero.

Query

UNWIND [4, 2, 1, 0] AS i
CALL (i) {
CREATE (:Person {num: 100/i})
} IN TRANSACTIONS OF 2 ROWS
RETURN i

Error message

/ by zero (Transactions committed: 1)

When the failure occurred, the first transaction had already been committed, so the database contains two
example nodes.

Query

MATCH (e:Person)
RETURN e.num

Result

e.num

25

50

Rows: 2

In the following example, ON ERROR CONTINUE is used after a failed inner transaction to execute the
remaining inner transactions and not fail the outer transaction:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR CONTINUE
RETURN n.num;

Result

n.num

100

null

50

25

Rows: 4

Note the difference in results when batching in transactions of 2 rows:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 2 ROWS
ON ERROR CONTINUE
RETURN n.num;

Result

n.num

null

null

50

25

Rows: 4

This happens because an inner transaction with the two first i elements (1 and 0) was created, and it fails
for 0. This causes it to be rolled back and the return variable is filled with nulls for those two elements.

In the following example, ON ERROR BREAK is used after a failed inner transaction to not execute the
remaining inner transaction and not fail the outer transaction:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR BREAK
RETURN n.num;

Result

n.num

100

null

null

null

Rows: 4

Note the difference in results when batching in transactions of 2 rows:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 2 ROWS
ON ERROR BREAK
RETURN n.num;

Result

n.num

null

null

null

null

Rows: 4

In the following example, ON ERROR FAIL is used after the failed inner transaction, to not execute the
remaining inner transactions and to fail the outer transaction:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR FAIL
RETURN n.num;

Error message

/ by zero (Transactions committed: 1)

Status report
Users can also report the execution status of the inner transactions by using REPORT STATUS AS var. This
flag is disallowed for ON ERROR FAIL. For more information, see Error behavior.

After each execution of the inner query finishes (successfully or not), a status value is created that records
information about the execution and the transaction that executed it:

• If the inner execution produces one or more rows as output, then a binding to this status value is
added to each row, under the selected variable name.

• If the inner execution fails then a single row is produced containing a binding to this status value under
the selected variable, and null bindings for all variables that should have been returned by the inner
query (if any).

The status value is a map value with the following fields:

• started: true when the inner transaction was started, false otherwise.

• committed, true when the inner transaction changes were successfully committed, false otherwise.

• transactionId: the inner transaction id, or null if the transaction was not started.

• errorMessage, the inner transaction error message, or null in case of no error.

Example of reporting status with ON ERROR CONTINUE:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR CONTINUE
REPORT STATUS AS s
RETURN n.num, s;

Result

n.num  s

100    {"committed": true, "errorMessage": null, "started": true, "transactionId": "neo4j-transaction-835"}

null   {"committed": false, "errorMessage": "/ by zero", "started": true, "transactionId": "neo4j-transaction-836"}

50     {"committed": true, "errorMessage": null, "started": true, "transactionId": "neo4j-transaction-837"}

25     {"committed": true, "errorMessage": null, "started": true, "transactionId": "neo4j-transaction-838"}

Rows: 4

Example of reporting status with ON ERROR BREAK:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR BREAK
REPORT STATUS AS s
RETURN n.num, s.started, s.committed, s.errorMessage;

Result

n.num s.started s.committed s.errorMessage

100 true true null

null true false "/ by zero"

null false false null

null false false null


Rows: 4

Reporting status with ON ERROR FAIL is disallowed:

Query

UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR FAIL
REPORT STATUS AS s
RETURN n.num, s.errorMessage;

Error

REPORT STATUS can only be used when specifying ON ERROR CONTINUE or ON ERROR BREAK

Concurrent transactions Label—new 5.21


By default, CALL { … } IN TRANSACTIONS is single-threaded; one CPU core is used to sequentially execute
batches.

However, CALL subqueries can also execute batches in parallel by appending IN [n] CONCURRENT
TRANSACTIONS, where n is a concurrency value used to set the maximum number of transactions that can be
executed in parallel. This allows CALL subqueries to utilize multiple CPU cores simultaneously, which can
significantly reduce the time required to execute a large, outer transaction.

The concurrency value is optional. If not specified, a default value based on the number of available CPU cores will be chosen. If a negative number is specified, the concurrency will be the number of available CPU cores reduced by the absolute value of that number.
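
As a sketch of the negative form, the following would run with a concurrency equal to the number of available CPU cores minus one (the CSV file and column are the same as in the example below):

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
CALL (row) {
  CREATE (:Person {tmdbId: row.person_tmdbId})
} IN -1 CONCURRENT TRANSACTIONS OF 10 ROWS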

Example 54. Load a CSV file in concurrent transactions

CALL { … } IN CONCURRENT TRANSACTIONS is particularly suitable for importing data without dependencies. This example creates Person nodes from a unique tmdbId value assigned to each person row in the CSV file (444 in total) in 3 concurrent transactions.

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
CALL (row) {
  CREATE (p:Person {tmdbId: row.person_tmdbId})
  SET p.name = row.name, p.born = row.born
} IN 3 CONCURRENT TRANSACTIONS OF 10 ROWS
RETURN count(*) AS personNodes

Result

personNodes

444

Rows: 1

Concurrency and non-deterministic results


CALL { … } IN TRANSACTIONS uses ordered semantics by default, where batches are committed in a
sequential row-by-row order. For example, in CALL { <I> } IN TRANSACTIONS, any writes done in the
execution of <I1> must be observed by <I2>, and so on.

In contrast, CALL { … } IN CONCURRENT TRANSACTIONS uses concurrent semantics, where both the number
of rows committed by a particular batch and the order of committed batches is undefined. That is, in CALL
{ <I> } IN CONCURRENT TRANSACTIONS, writes committed in the execution of <I1> may or may not be
observed by <I2>, and so on.

The results of CALL subqueries executed in concurrent transactions may, therefore, not be deterministic. To
guarantee deterministic results, ensure that the results of committed batches are not dependent on each
other.
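
As an illustration of such a dependency (a hypothetical sketch, not drawn from the examples above), the following query stores, on each new Item node, a count of the Item nodes that already exist. Under concurrent execution, the count a batch observes depends on which other batches happen to have committed first, so the stored values are not deterministic:

UNWIND range(1, 100) AS i
CALL (i) {
  // The COUNT below depends on how many other batches have already
  // committed, so seenBefore varies from run to run.
  CREATE (:Item {num: i, seenBefore: COUNT { MATCH (:Item) }})
} IN 4 CONCURRENT TRANSACTIONS OF 10 ROWS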

Using CALL { … } IN CONCURRENT TRANSACTIONS can impact error behavior. Specifically, when using ON ERROR BREAK or ON ERROR FAIL and one transaction fails, then any concurrent transactions may not be interrupted and rolled back (though subsequent ones would). This is because no timing guarantees can be given for concurrent transactions. That is, an ongoing transaction may or may not commit successfully in the time window when the error is being handled. Use the status report to determine which batches were committed and which failed or did not start.

Deadlocks
When a write transaction occurs, Neo4j takes locks to preserve data consistency while updating. For
example, when creating or deleting a relationship, a write lock is taken on both the specific relationship
and its connected nodes.

A deadlock happens when two transactions are blocked by each other because they are attempting to concurrently modify a node or a relationship that is locked by the other transaction (for more information about locks and deadlocks in Neo4j, see Operations Manual → Concurrent data access).

A deadlock may occur when using CALL { … } IN CONCURRENT TRANSACTIONS if the transactions for two or
more batches try to take the same locks in an order that results in a circular dependency between them. If
so, the impacted transactions are always rolled back, and an error is thrown unless the query is appended
with ON ERROR CONTINUE or ON ERROR BREAK.

Example 55. Dealing with deadlocks

The following query tries to create Movie and Year nodes connected by a RELEASED_IN relationship. Note that there are only three different years in the CSV file, meaning that only three Year nodes should be created.

Query with concurrent transaction causing a deadlock

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
CALL (row) {
  MERGE (m:Movie {movieId: row.movieId})
  MERGE (y:Year {year: row.year})
  MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS

The deadlock occurs because the two transactions are simultaneously trying to lock and merge the
same Year.

Error message

ForsetiClient[transactionId=64, clientId=12] can't acquire
ExclusiveLock{owner=ForsetiClient[transactionId=63, clientId=9]} on
NODE_RELATIONSHIP_GROUP_DELETE(98) because holders of that lock are waiting for
ForsetiClient[transactionId=64, clientId=12].
Wait list:ExclusiveLock[
Client[63] waits for [ForsetiClient[transactionId=64, clientId=12]]]

The following query uses ON ERROR CONTINUE to bypass any deadlocks and continue with the
execution of subsequent inner transactions. It returns the transactionID, commitStatus and
errorMessage of the failed transactions.

Query using ON ERROR CONTINUE to ignore deadlocks and complete outer transaction

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
CALL (row) {
  MERGE (m:Movie {movieId: row.movieId})
  MERGE (y:Year {year: row.year})
  MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS ON ERROR CONTINUE REPORT STATUS as status
WITH status
WHERE status.errorMessage IS NOT NULL
RETURN status.transactionId AS transaction, status.committed AS commitStatus, status.errorMessage AS errorMessage

Result

transaction               commitStatus  errorMessage

"neo4j-transaction-169"   FALSE         "ForsetiClient[transactionId=169, clientId=11] can't acquire
                                        ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on
                                        NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for
                                        ForsetiClient[transactionId=169, clientId=11].
                                        Wait list:ExclusiveLock[
                                        Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]"

(The same row is repeated a further nine times in the output.)


While failed transactions may be more efficiently retried using a driver, below is an example of how failed transactions can be retried within the same Cypher query:

Query retrying failed transactions

LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
CALL (row) {
  MERGE (m:Movie {movieId: row.movieId})
  MERGE (y:Year {year: row.year})
  MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS ON ERROR CONTINUE REPORT STATUS as status
WITH *
WHERE status.committed = false
CALL (row) {
  MERGE (m:Movie {movieId: row.movieId})
  MERGE (y:Year {year: row.year})
  MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS ON ERROR FAIL

Restrictions
These are the restrictions on queries that use CALL { … } IN TRANSACTIONS:

• A nested CALL { … } IN TRANSACTIONS inside a CALL { … } clause is not supported.

• A CALL { … } IN TRANSACTIONS in a UNION is not supported.

• A CALL { … } IN TRANSACTIONS after a write clause is not supported, unless that write clause is inside
a CALL { … } IN TRANSACTIONS (see the sketch after this list).
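
As a sketch of the last restriction, a query along the following lines would be rejected, because the transactional subquery follows a write clause (CREATE) in the outer query; the Log and Temp labels are purely illustrative:

CREATE (:Log {startedAt: datetime()})
CALL {
  MATCH (n:Temp)
  DETACH DELETE n
} IN TRANSACTIONS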

COLLECT subqueries
A COLLECT subquery expression can be used to create a list with the rows returned by a given subquery.

COLLECT subqueries differ from COUNT and EXISTS subqueries in that the final RETURN clause is mandatory.
The RETURN clause must return exactly one column.

Example graph
The following graph is used for the examples below:

[Figure: example graph. Person nodes Andy (Swedish, age 36), Timothy (age 25, nickname 'Tim'), and Peter (age 35, nickname 'Pete'); HAS_DOG and HAS_CAT relationships (with since properties) connect them to Dog and Cat nodes, and the Dog Fido has a HAS_TOY relationship to the Toy Banana.]
To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(andy:Swedish:Person {name: 'Andy', age: 36}),
(timothy:Person {name: 'Timothy', nickname: 'Tim', age: 25}),
(peter:Person {name: 'Peter', nickname: 'Pete', age: 35}),
(andy)-[:HAS_DOG {since: 2016}]->(:Dog {name:'Andy'}),
(timothy)-[:HAS_CAT {since: 2019}]->(:Cat {name:'Mittens'}),
(fido:Dog {name:'Fido'})<-[:HAS_DOG {since: 2010}]-(peter)-[:HAS_DOG {since: 2018}]->(:Dog {name:'Ozzy'}),
(fido)-[:HAS_TOY]->(:Toy{name:'Banana'})

Simple COLLECT subquery


Variables introduced by the outside scope can be used in the COLLECT subquery without importing them. In
this regard, COLLECT subqueries are different from CALL subqueries, which do require importing. The
following query exemplifies this and outputs the owners of the dog named Ozzy:

MATCH (person:Person)
WHERE 'Ozzy' IN COLLECT { MATCH (person)-[:HAS_DOG]->(dog:Dog) RETURN dog.name }
RETURN person.name AS name

name

"Peter"

Rows: 1

COLLECT subquery with WHERE clause
A WHERE clause can be used inside the COLLECT subquery. Variables introduced by the MATCH clause and the
outside scope can be used in the inner scope.

MATCH (person:Person)
RETURN person.name as name, COLLECT {
MATCH (person)-[r:HAS_DOG]->(dog:Dog)
WHERE r.since > 2017
RETURN dog.name
} as youngDogs

name youngDogs

"Andy" []

"Timothy" []

"Peter" ["Ozzy"]

Rows: 3

COLLECT subquery with a UNION


COLLECT can be used with a UNION clause. The below example shows the collection of pet names each
person has by using a UNION clause:

MATCH (person:Person)
RETURN
person.name AS name,
COLLECT {
MATCH (person)-[:HAS_DOG]->(dog:Dog)
RETURN dog.name AS petName
UNION
MATCH (person)-[:HAS_CAT]->(cat:Cat)
RETURN cat.name AS petName
} AS petNames

name petNames

"Andy" ["Andy"]

"Timothy" ["Mittens"]

"Peter" ["Ozzy", "Fido"]

Rows: 3

COLLECT subquery with WITH


Variables from the outside scope are visible for the entire subquery, even when using a WITH clause. To
avoid confusion, shadowing of these variables is not allowed. An outside scope variable is shadowed
when a newly introduced variable within the inner scope is defined with the same variable. In the example
below, the outer variable name is shadowed and will therefore throw an error.

WITH 'Peter' as name
MATCH (person:Person {name: name})
RETURN COLLECT {
WITH 'Ozzy' AS name
MATCH (person)-[r:HAS_DOG]->(d:Dog {name: name})
RETURN d.name
} as dogsOfTheYear

Error message

The variable `name` is shadowing a variable with the same name from the outer scope and needs to be
renamed (line 4, column 20 (offset: 92))

New variables can be introduced into the subquery, as long as they use a different identifier. In the
example below, a WITH clause introduces a new variable. Note that the outer scope variable person
referenced in the main query is still available after the WITH clause.

MATCH (person:Person)
RETURN person.name AS name, COLLECT {
WITH 2018 AS yearOfTheDog
MATCH (person)-[r:HAS_DOG]->(d:Dog)
WHERE r.since = yearOfTheDog
RETURN d.name
} as dogsOfTheYear

name dogsOfTheYear

"Andy" []

"Timothy" []

"Peter" ["Ozzy"]

Rows: 3

Using COLLECT subqueries inside other clauses


COLLECT can be used in any position in a query, with the exception of administration commands, where the
COLLECT expression is restricted. See a few examples below of how COLLECT can be used in different
positions within a query:

Using COLLECT in RETURN

MATCH (person:Person)
RETURN person.name,
COLLECT {
MATCH (person)-[:HAS_DOG]->(d:Dog)
MATCH (d)-[:HAS_TOY]->(t:Toy)
RETURN t.name
} as toyNames

person.name toyNames

"Andy" []

"Timothy" []

"Peter" ["Banana"]


Rows: 3

Using COLLECT in SET

MATCH (person:Person) WHERE person.name = "Peter"


SET person.dogNames = COLLECT { MATCH (person)-[:HAS_DOG]->(d:Dog) RETURN d.name }
RETURN person.dogNames as dogNames

dogNames

["Ozzy", "Fido"]

Rows: 1
Properties set: 1

Using COLLECT in CASE

MATCH (person:Person)
RETURN
CASE
WHEN COLLECT { MATCH (person)-[:HAS_DOG]->(d:Dog) RETURN d.name } = [] THEN "No Dogs " + person.name
ELSE person.name
END AS result

result

"Andy"

"No Dogs Timothy"

"Peter"

Rows: 3

Using COLLECT as a grouping key


The following query collects all persons by their dogs' names, and then calculates the average age for each
group.

MATCH (person:Person)
RETURN COLLECT { MATCH (person)-[:HAS_DOG]->(d:Dog) RETURN d.name } AS dogNames,
avg(person.age) AS averageAge
ORDER BY dogNames

dogNames averageAge

[] 25.0

["Andy"] 36.0

["Ozzy", "Fido"] 35.0

Rows: 3

Using COLLECT vs collect()
COLLECT does not handle null values in the same way that the aggregating function collect() does. The
collect() function automatically removes null values. COLLECT will not remove null values automatically.
However, they can be removed by adding a filtering step in the subquery.

The following queries illustrate these differences:

MATCH (p:Person)
RETURN collect(p.nickname) AS names

names

["Pete", "Tim"]

Rows: 1

RETURN COLLECT {
MATCH (p:Person)
RETURN p.nickname ORDER BY p.nickname
} AS names

names

["Pete", "Tim", null]

Rows: 1

RETURN COLLECT {
MATCH (p:Person)
WHERE p.nickname IS NOT NULL
RETURN p.nickname ORDER BY p.nickname
} AS names

names

["Pete", "Tim"]

Rows: 1

Rules
The following is true for COLLECT subqueries:

• Any non-writing query is allowed.

• The final RETURN clause is mandatory when using a COLLECT subquery. The RETURN clause must return
exactly one column.

• A COLLECT subquery can appear anywhere in a query that an expression is valid.

• Any variable that is defined in the outside scope can be referenced inside the COLLECT subquery’s own
scope.

• Variables introduced inside the COLLECT subquery are not part of the outside scope and therefore cannot be accessed on the outside.

COUNT subqueries
A COUNT subquery can be used to count the number of rows returned by the subquery.

Example graph
The following graph is used for the examples below:

[Figure: example graph. Person nodes Andy (Swedish, age 36), Timothy (age 25, nickname 'Tim'), and Peter (age 35, nickname 'Pete'); HAS_DOG and HAS_CAT relationships (with since properties) connect them to Dog and Cat nodes, and the Dog Fido has a HAS_TOY relationship to the Toy Banana.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(andy:Swedish:Person {name: 'Andy', age: 36}),
(timothy:Person {name: 'Timothy', nickname: 'Tim', age: 25}),
(peter:Person {name: 'Peter', nickname: 'Pete', age: 35}),
(andy)-[:HAS_DOG {since: 2016}]->(:Dog {name:'Andy'}),
(timothy)-[:HAS_CAT {since: 2019}]->(:Cat {name:'Mittens'}),
(fido:Dog {name:'Fido'})<-[:HAS_DOG {since: 2010}]-(peter)-[:HAS_DOG {since: 2018}]->(:Dog {name:'Ozzy'}),
(fido)-[:HAS_TOY]->(:Toy{name:'Banana'})

Simple COUNT subquery


Variables introduced by the outside scope can be used in the COUNT subquery without importing them. In
this regard, COUNT subqueries are different from CALL subqueries, which do require importing. The following
query exemplifies this and outputs the owners of more than one dog:

MATCH (person:Person)
WHERE COUNT { (person)-[:HAS_DOG]->(:Dog) } > 1
RETURN person.name AS name

name

"Peter"

Rows: 1

COUNT subquery with WHERE clause


A WHERE clause can be used inside the COUNT pattern. Variables introduced by the MATCH clause and the
outside scope can be used in this scope.

MATCH (person:Person)
WHERE COUNT {
(person)-[:HAS_DOG]->(dog:Dog)
WHERE person.name = dog.name
} = 1
RETURN person.name AS name

name

"Andy"

Rows: 1

COUNT subquery with a UNION Label—new 5.3


COUNT can be used with a UNION clause. If the UNION clause is distinct, the RETURN clause is required. UNION
ALL clauses do not require the RETURN clause. However, it is worth noting that if one branch has a RETURN
clause, then all require one. The below example shows the count of pets each person has by using a UNION
clause:

MATCH (person:Person)
RETURN
person.name AS name,
COUNT {
MATCH (person)-[:HAS_DOG]->(dog:Dog)
RETURN dog.name AS petName
UNION
MATCH (person)-[:HAS_CAT]->(cat:Cat)
RETURN cat.name AS petName
} AS numPets

name numPets

"Andy" 1

"Timothy" 1

"Peter" 2

Rows: 3

COUNT subquery with WITH Label—new 5.3


Variables from the outside scope are visible for the entire subquery, even when using a WITH clause. To
avoid confusion, shadowing of these variables is not allowed. An outside scope variable is shadowed when a newly introduced variable within the inner scope is defined with the same variable. In the example
below, the outer variable name is shadowed and will therefore throw an error.

WITH 'Peter' as name
MATCH (person:Person {name: name})
WHERE COUNT {
  WITH "Ozzy" AS name
  MATCH (person)-[:HAS_DOG]->(d:Dog)
  WHERE d.name = name
} = 1
RETURN person.name AS name

Error message

The variable `name` is shadowing a variable with the same name from the outer scope and needs to be
renamed (line 4, column 20 (offset: 90))

New variables can be introduced into the subquery, as long as they use a different identifier. In the
example below, a WITH clause introduces a new variable. Note that the outer scope variable person
referenced in the main query is still available after the WITH clause.

MATCH (person:Person)
WHERE COUNT {
WITH "Ozzy" AS dogName
MATCH (person)-[:HAS_DOG]->(d:Dog)
WHERE d.name = dogName
} = 1
RETURN person.name AS name

name

"Peter"

Rows: 1

Using COUNT subqueries inside other clauses


COUNT can be used in any position in a query, with the exception of administration commands, where it is
restricted. See a few examples below:

Using COUNT in RETURN

MATCH (person:Person)
RETURN person.name, COUNT { (person)-[:HAS_DOG]->(:Dog) } as howManyDogs

person.name howManyDogs

"Andy" 1

"Timothy" 0

"Peter" 2

Rows: 3

Using COUNT in SET

MATCH (person:Person) WHERE person.name ="Andy"


SET person.howManyDogs = COUNT { (person)-[:HAS_DOG]->(:Dog) }
RETURN person.howManyDogs as howManyDogs

howManyDogs

1

Rows: 1
Properties set: 1

Using COUNT in CASE

MATCH (person:Person)
RETURN
CASE
WHEN COUNT { (person)-[:HAS_DOG]->(:Dog) } > 1 THEN "Doglover " + person.name
ELSE person.name
END AS result

result

"Andy"

"Timothy"

"Doglover Peter"

Rows: 3

Using COUNT as a grouping key


The following query groups all persons by how many dogs they own, and then calculates the average age
for each group.

MATCH (person:Person)
RETURN COUNT { (person)-[:HAS_DOG]->(:Dog) } AS numDogs,
avg(person.age) AS averageAge
ORDER BY numDogs

numDogs averageAge

0 25.0

1 36.0

2 35.0

Rows: 3

COUNT subquery with RETURN Label—new 5.3


COUNT subqueries do not require a RETURN clause at the end of the subquery. If one is present, it does not
need to be aliased. This is a difference compared to CALL subqueries. Any variables returned in a COUNT subquery will not be available after the subquery.

MATCH (person:Person)
WHERE COUNT {
MATCH (person)-[:HAS_DOG]->(:Dog)
RETURN person.name
} = 1
RETURN person.name AS name

name

"Andy"

Rows: 1

Rules
The following is true for COUNT subqueries:

• Any non-writing query is allowed.

• The final RETURN clause may be omitted, as any variable defined within the subquery will not be
available outside of the expression, even if a final RETURN clause is used. One exception to this is that
for a DISTINCT UNION clause, the RETURN clause is still mandatory.

• The MATCH keyword can be omitted in subqueries in cases where the COUNT consists of only a pattern
and an optional WHERE clause.

• A COUNT subquery can appear anywhere in a query that an expression is valid.

• Any variable that is defined in the outside scope can be referenced inside the COUNT subquery’s own
scope.

• Variables introduced inside the COUNT subquery are not part of the outside scope and therefore cannot
be accessed on the outside.

EXISTS subqueries
An EXISTS subquery can be used to find out if a specified pattern exists at least once in the graph. It serves
the same purpose as a path pattern but it is more powerful because it allows you to use MATCH and WHERE
clauses internally.

Example graph
The following graph is used for the examples below:

[Figure: example graph. Person nodes Andy (Swedish, age 36), Timothy (age 25, nickname 'Tim'), and Peter (age 35, nickname 'Pete'); HAS_DOG and HAS_CAT relationships (with since properties) connect them to Dog and Cat nodes, and the Dog Fido has a HAS_TOY relationship to the Toy Banana.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(andy:Swedish:Person {name: 'Andy', age: 36}),
(timothy:Person {name: 'Timothy', nickname: 'Tim', age: 25}),
(peter:Person {name: 'Peter', nickname: 'Pete', age: 35}),
(andy)-[:HAS_DOG {since: 2016}]->(:Dog {name:'Andy'}),
(timothy)-[:HAS_CAT {since: 2019}]->(:Cat {name:'Mittens'}),
(fido:Dog {name:'Fido'})<-[:HAS_DOG {since: 2010}]-(peter)-[:HAS_DOG {since: 2018}]->(:Dog {name:'Ozzy'}),
(fido)-[:HAS_TOY]->(:Toy{name:'Banana'})

Simple EXISTS subquery


Variables introduced by the outside scope can be used in the EXISTS subquery without importing them. In
this regard, EXISTS subqueries are different from CALL subqueries, which do require importing. The
following example shows this:

MATCH (person:Person)
WHERE EXISTS {
(person)-[:HAS_DOG]->(:Dog)
}
RETURN person.name AS name

name

"Andy"

"Peter"

Rows: 2

EXISTS subquery with WHERE clause


A WHERE clause can be used in conjunction to the MATCH. Variables introduced by the MATCH clause and the
outside scope can be used in this scope.

MATCH (person:Person)
WHERE EXISTS {
MATCH (person)-[:HAS_DOG]->(dog:Dog)
WHERE person.name = dog.name
}
RETURN person.name AS name

name

"Andy"

Rows: 1

Nesting EXISTS subqueries


EXISTS subqueries can be nested like the following example shows. The nesting also affects the scopes.
That means that it is possible to access all variables from inside the subquery which are either from the
outside scope or defined in the very same subquery.

MATCH (person:Person)
WHERE EXISTS {
MATCH (person)-[:HAS_DOG]->(dog:Dog)
WHERE EXISTS {
MATCH (dog)-[:HAS_TOY]->(toy:Toy)
WHERE toy.name = 'Banana'
}
}
RETURN person.name AS name

name

"Peter"

Rows: 1

EXISTS subquery outside of a WHERE clause


EXISTS subquery expressions can appear anywhere that an expression is valid. Here the result is a boolean
that shows whether the subquery can find the given pattern.

MATCH (person:Person)
RETURN person.name AS name, EXISTS {
MATCH (person)-[:HAS_DOG]->(:Dog)
} AS hasDog

name hasDog

"Andy" true

"Timothy" false

"Peter" true

Rows: 3

EXISTS subquery with a UNION Label—new 5.3
EXISTS can be used with a UNION clause, and the RETURN clauses are not required. It is worth noting that if one branch has a RETURN clause, then all branches require one. The below example demonstrates that if one of the UNION branches were to return at least one row, the entire EXISTS expression will evaluate to true.

MATCH (person:Person)
RETURN
person.name AS name,
EXISTS {
MATCH (person)-[:HAS_DOG]->(:Dog)
UNION
MATCH (person)-[:HAS_CAT]->(:Cat)
} AS hasPet

name hasPet

"Andy" true

"Timothy" true

"Peter" true

Rows: 3

EXISTS subquery with WITH Label—new 5.3


Variables from the outside scope are visible for the entire subquery, even when using a WITH clause. To
avoid confusion, shadowing of these variables is not allowed. An outside scope variable is shadowed
when a newly introduced variable within the inner scope is defined with the same variable. In the example
below, the outer variable name is shadowed and will therefore throw an error.

WITH 'Peter' as name
MATCH (person:Person {name: name})
WHERE EXISTS {
  WITH "Ozzy" AS name
  MATCH (person)-[:HAS_DOG]->(d:Dog)
  WHERE d.name = name
}
RETURN person.name AS name

Error message

The variable `name` is shadowing a variable with the same name from the outer scope and needs to be
renamed (line 4, column 20 (offset: 90))

New variables can be introduced into the subquery, as long as they use a different identifier. In the
example below, a WITH clause introduces a new variable. Note that the outer scope variable person
referenced in the main query is still available after the WITH clause.

222
MATCH (person:Person)
WHERE EXISTS {
WITH "Ozzy" AS dogName
MATCH (person)-[:HAS_DOG]->(d:Dog)
WHERE d.name = dogName
}
RETURN person.name AS name

name

"Peter"

Rows: 1

EXISTS subquery with RETURN Label—new 5.3


EXISTS subqueries do not require a RETURN clause at the end of the subquery. If one is present, it does not
need to be aliased, which is different compared to CALL subqueries. Any variables returned in an EXISTS
subquery will not be available after the subquery.

MATCH (person:Person)
WHERE EXISTS {
MATCH (person)-[:HAS_DOG]->(:Dog)
RETURN person.name
}
RETURN person.name AS name

name

"Andy"

"Peter"

Rows: 2

Rules
The following is true for EXISTS subqueries:

• Any non-writing query is allowed.

• If the EXISTS subquery evaluates to at least one row, the whole expression will become true. This also
means that the system only needs to evaluate if there is at least one row and can skip the rest of the
work.

• EXISTS subqueries differ from regular queries in that the final RETURN clause may be omitted, as any
variable defined within the subquery will not be available outside of the expression, even if a final
RETURN clause is used.

• The MATCH keyword can be omitted in subqueries in cases where the EXISTS consists of only a pattern
and an optional WHERE clause.

• An EXISTS subquery can appear anywhere in a query that an expression is valid.

• Any variable that is defined in the outside scope can be referenced inside the subquery’s own scope.

• Variables introduced inside the subquery are not part of the outside scope and therefore cannot be accessed on the outside.

Patterns
Graph pattern matching sits at the very core of Cypher. It is the mechanism used to navigate, describe and
extract data from a graph by applying a declarative pattern. Inside a MATCH clause, you can use graph
patterns to define the data you are searching for and the data to return. Graph pattern matching can also
be used without a MATCH clause, in the subqueries EXISTS, COUNT, and COLLECT.

A graph pattern describes data using a syntax that is similar to how the nodes and relationships of a
property graph are drawn on a whiteboard. On a whiteboard, nodes are drawn as circles and relationships
are drawn as arrows. Cypher represents the circles as a pair of parentheses, and the arrows as dashes and
greater-than or less-than symbols:

()-->()<--()

These simple patterns for nodes and relationships form the building blocks of path patterns that can match
paths of a fixed length. As well as discussing simple fixed-length patterns, this chapter covers more
complex patterns, showing how to match patterns of a variable or unknown length, find the shortest paths
between a given set of nodes, add inline filters for improved query performance, and add cycles and non-
linear shapes to path patterns.

This chapter includes the following sections:

• Primer - a short primer on how to get started with using graph pattern matching in Cypher.

• Fixed length patterns - information about node, relationship, and path patterns.

• Variable length patterns - information about quantified path patterns, quantified relationships, and
group variables.

• Shortest paths - information about finding the SHORTEST path patterns.

• Non-linear patterns - information about equijoins and graph patterns (combined path patterns).

• Syntax and semantics - a reference for looking up the syntax and semantics of graph pattern
matching.

The model data in the examples used in this chapter are based on the UK national rail network, using
publicly available datasets.

Primer
This section contains a primer covering some fundamental features of graph pattern matching with Cypher
queries.

Example graph
The example graph used in this tutorial is a model of train Stations, and different train services with Stops
that call at the Stations.

[Figure: example graph. Station nodes for Denmark Hill, Battersea Park, Wandsworth Road, Clapham High Street, Peckham Rye, Brixton, London Victoria, and Clapham Junction are connected by LINK relationships with distance properties; Stop nodes with arrives/departs times have CALLS_AT relationships to the Stations and are ordered by NEXT relationships.]

To recreate the graph, run the following query against an empty Neo4j database:

Query

CREATE (n1:Station {name: 'Denmark Hill'}),
  (n5:Station {name: 'Battersea Park'}),
  (n6:Station {name: 'Wandsworth Road'}),
  (n15:Station {name: 'Clapham High Street'}),
  (n16:Station {name: 'Peckham Rye'}),
  (n17:Station {name: 'Brixton'}),
  (n14:Station {name: 'London Victoria'}),
  (n18:Station {name: 'Clapham Junction'}),
  (p10:Stop {departs: time('22:37'), arrives: time('22:36')}),
  (p0:Stop {departs: time('22:41'), arrives: time('22:41')}),
  (p2:Stop {departs: time('22:43'), arrives: time('22:43')}),
  (p17:Stop {arrives: time('22:50'), departs: time('22:50')}),
  (p18:Stop {arrives: time('22:46'), departs: time('22:46')}),
  (p19:Stop {departs: time('22:33'), arrives: time('22:31')}),
  (p21:Stop {arrives: time('22:55')}),
  (p20:Stop {departs: time('22:44'), arrives: time('22:43')}),
  (p22:Stop {arrives: time('22:55')}),
  (p23:Stop {arrives: time('22:48')}),
  (n15)-[:LINK {distance: 1.96}]->(n1)-[:LINK {distance: 0.86}]->(n16),
  (n15)-[:LINK {distance: 0.39}]->(n6)<-[:LINK {distance: 0.7}]-(n5)-[:LINK {distance: 1.24}]->(n14),
  (n5)-[:LINK {distance: 1.45}]->(n18),
  (n14)<-[:LINK {distance: 3.18}]-(n17)-[:LINK {distance: 1.11}]->(n1),
  (p2)-[:CALLS_AT]->(n6), (p17)-[:CALLS_AT]->(n5), (p19)-[:CALLS_AT]->(n16),
  (p22)-[:CALLS_AT]->(n14), (p18)-[:CALLS_AT]->(n18), (p0)-[:CALLS_AT]->(n15),
  (p23)-[:CALLS_AT]->(n5), (p20)-[:CALLS_AT]->(n1),
  (p21)-[:CALLS_AT]->(n14), (p10)-[:CALLS_AT]->(n1),
  (p19)-[:NEXT]->(p10)-[:NEXT]->(p0)-[:NEXT]->(p2)-[:NEXT]->(p23),
  (p22)<-[:NEXT]-(p17)<-[:NEXT]-(p18), (p21)<-[:NEXT]-(p20)

Matching fixed-length paths


An empty pair of parentheses is a node pattern that will match any node. This example gets a count of all
the nodes in the graph:

MATCH ()
RETURN count(*) AS numNodes

Result

numNodes

18

Rows: 1

Adding a label to the node pattern will filter on nodes with that label (see label expressions). The following
query gets a count of all the nodes with the label Stop:

MATCH (:Stop)
RETURN count(*) AS numStops

Result

numStops

10

Rows: 1

Path patterns can match relationships and the nodes they connect. The following query gets the arrival
time of all trains calling at Denmark Hill:

MATCH (s:Stop)-[:CALLS_AT]->(:Station {name: 'Denmark Hill'})
RETURN s.arrives AS arrivalTime

Result

arrivalTime

"22:36:00Z"

"22:43:00Z"

Rows: 2

Path patterns can include inline WHERE clauses. The following query gets the next calling point of the
service that departs Denmark Hill at 22:37:

MATCH (n:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-
      (s:Stop WHERE s.departs = time('22:37'))-[:NEXT]->
      (:Stop)-[:CALLS_AT]->(d:Station)
RETURN d.name AS nextCallingPoint

Result

nextCallingPoint

"Clapham High Street"

Rows: 1

For more information, see Fixed length patterns.

Matching variable-length paths


Variable-length paths that only traverse relationships with a specified type can be matched with quantified
relationships. Any variable declared in the relationship pattern will return a list of the relationships
traversed. The following query returns the total distance traveled via all LINKs connecting the stations
Peckham Rye and Clapham Junction:

Query

MATCH (:Station {name: 'Peckham Rye'})-[link:LINK]-+
      (:Station {name: 'Clapham Junction'})
RETURN reduce(acc = 0.0, l IN link | round(acc + l.distance, 2)) AS totalDistance

Result

totalDistance

7.84

5.36

Rows: 2

-[:LINK]-+ is a quantified relationship. It is composed of a relationship pattern -[:LINK]- that matches relationships going in either direction, and a quantifier + that means it will match one or more relationships. As no node patterns are included with quantified relationships, they will match any intermediate nodes.

Variable-length paths can also be matched with quantified path patterns, which allow both WHERE clauses
and accessing the nodes traversed by the path. The following query returns a list of calling points on
routes from Peckham Rye to London Victoria, where no distance between stations is greater than two
miles:

Query

MATCH (:Station {name: 'Peckham Rye'})
      (()-[link:LINK]-(s) WHERE link.distance <= 2)+
      (:Station {name: 'London Victoria'})
UNWIND s AS station
RETURN station.name AS callingPoint

Result

callingPoint

"Denmark Hill"

"Clapham High Street"

"Wandsworth Road"

"Battersea Park"

"London Victoria"


Rows: 5

WHERE clauses inside node patterns can themselves include path patterns. The following query uses an EXISTS subquery to anchor on the last Stop in a sequence of Stops, and returns the departure times, arrival times and final destination of all services calling at Denmark Hill:

Query

MATCH (:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-(s1:Stop)-[:NEXT]->+
      (sN:Stop WHERE NOT EXISTS { (sN)-[:NEXT]->(:Stop) })-[:CALLS_AT]->
      (d:Station)
RETURN s1.departs AS departure, sN.arrives AS arrival,
       d.name AS finalDestination

Result

departure arrival finalDestination

'22:37:00Z' '22:48:00Z' "Battersea Park"

'22:44:00Z' '22:55:00Z' "London Victoria"

Rows: 2

Node variables declared inside quantified path patterns become bound to lists of nodes, which can be
unwound and used in subsequent MATCH clauses. The following query lists the calling points of the Peckham
Rye to Battersea Park train service:

Query

MATCH (:Station {name: 'Peckham Rye'})<-[:CALLS_AT]-(:Stop)
      (()-[:NEXT]->(s:Stop))+
      ()-[:CALLS_AT]->(:Station {name: 'Battersea Park'})
UNWIND s AS stop
MATCH (stop)-[:CALLS_AT]->(station:Station)
RETURN stop.arrives AS arrival, station.name AS callingPoint

Result

arrival callingPoint

"22:36:00Z" "Denmark Hill"

"22:41:00Z" "Clapham High Street"

"22:43:00Z" "Wandsworth Road"

"22:48:00Z" "Battersea Park"

Rows: 4

Repeating a node variable in a path pattern enables the same node to be bound more than once in a path
(see equijoins). The following query finds all stations that are on a cycle (i.e., pass through the same
Station more than once) formed by the LINK between Stations:

Query

MATCH (n:Station)-[:LINK]-+(n)
RETURN DISTINCT n.name AS station

Result

station

"Denmark Hill"

"Battersea Park"

"Wandsworth Road"

"Clapham High Street"

"Brixton"

"London Victoria"

Rows: 6

Complex, non-linear paths can be matched using graph patterns, a comma separated list of path patterns
that are connected via repeated node variables, i.e. equijoins. For example, a passenger is traveling from
Denmark Hill and wants to join the train service to London Victoria that leaves Clapham Junction at
22:46. The following query finds the departure time from Denmark Hill as well as the changeover Station
and time of arrival:

Query

MATCH (:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-
      (s1:Stop)-[:NEXT]->+(s2:Stop)-[:CALLS_AT]->
      (c:Station)<-[:CALLS_AT]-(x:Stop),
      (:Station {name: 'Clapham Junction'})<-[:CALLS_AT]-
      (t1:Stop)-[:NEXT]->+(x)-[:NEXT]->+(:Stop)-[:CALLS_AT]->
      (:Station {name: 'London Victoria'})
WHERE t1.departs = time('22:46')
  AND s2.arrives < x.departs
RETURN s1.departs AS departure, s2.arrives AS changeArrival,
       c.name AS changeAt

Result

departure changeArrival changeAt

"22:37:00Z" "22:48:00Z" "Battersea Park"

Rows: 1

For more information, see Variable length patterns.

Matching shortest paths


The shortest path between two nodes can be found using the SHORTEST keyword:

Query

MATCH p = SHORTEST 1
(:Station {name: "Brixton"})
(()-[:LINK]-(:Station))+
(:Station {name: "Clapham Junction"})
RETURN [station IN nodes(p) | station.name] AS route

Result

route

["Brixton", "London Victoria", "Battersea Park", "Clapham Junction"]

Rows: 1

To find all shortest paths, the ALL SHORTEST keywords can be used:

Query

MATCH p = ALL SHORTEST
      (:Station {name: "Denmark Hill"})
      (()-[:LINK]-(:Station))+
      (:Station {name: "Clapham Junction"})
RETURN [station IN nodes(p) | station.name] AS route

Result

route

["Denmark Hill", "Clapham High Street", "Wandsworth Road", "Battersea Park", "Clapham Junction"]

["Denmark Hill", "Brixton", "London Victoria", "Battersea Park", "Clapham Junction"]

Rows: 2

In general, SHORTEST k can be used to return the k shortest paths. The following returns the two shortest
paths:

Query

MATCH p = SHORTEST 2
(:Station {name: "Denmark Hill"})
(()-[:LINK]-(:Station))+
(:Station {name: "Clapham High Street"})
RETURN [station IN nodes(p) | station.name] AS route

Result

route

["Denmark Hill", "Clapham High Street"]

["Denmark Hill", "Brixton", "London Victoria", "Battersea Park", "Wandsworth Road", "Clapham High Street"]

Rows: 2

For more information, see Shortest paths.

Fixed length patterns
The most basic form of graph pattern matching in Cypher involves the matching of fixed length patterns.
This includes node patterns, relationship patterns, and path patterns.

Node patterns
Every graph pattern contains at least one node pattern. The simplest graph pattern is a single, empty node
pattern:

MATCH ()

The empty node pattern matches every node in a property graph. In order to obtain a reference to the
nodes matched, a variable needs to be declared in the node pattern:

MATCH (n)

With this reference, node properties can be accessed:

MATCH (n)
RETURN n.name

Adding a label expression to the node pattern means only nodes with labels that match will be returned.
The following matches nodes that have the Stop label:

MATCH (n:Stop)

The following more complex label expression matches all nodes that are either a TrainStation and a
BusStation or StationGroup:

MATCH (n:(TrainStation&BusStation)|StationGroup)

A map of property names and values can be used to match on node properties based on equality with the
specified values. The following matches nodes that have their mode property equal to Rail:

MATCH (n { mode: 'Rail' })

More general predicates can be expressed with a WHERE clause. The following matches nodes whose name
property starts with Preston:

MATCH (n:Station WHERE n.name STARTS WITH 'Preston')

See the node patterns reference section for more details.

Relationship patterns
The simplest possible relationship pattern is a pair of dashes:

--

This pattern matches a relationship with any direction and does not filter on any relationship type or
property. Unlike a node pattern, a relationship pattern cannot be used in a MATCH clause without node
patterns at both ends. See path patterns for more details.

In order to obtain a reference to the relationships matched by the pattern, a relationship variable needs to
be declared in the pattern by adding the variable name in square brackets in between the dashes:

-[r]-

To match a specific direction, add < or > to the left or right hand side respectively:

-[r]->

To match on a relationship type, add the type name after a colon:

-[:CALLS_AT]->

Similar to node patterns, a map of property names and values can be added to filter on properties of the
relationship based on equality with the specified values:

-[{ distance: 0.24, duration: 'PT4M' }]->

A WHERE clause can be used for more general predicates:

-[r WHERE time() + duration(r.duration) < time('22:00') ]->

See the relationship patterns reference section for more details.

Path patterns
Any valid path starts and ends with a node, with relationships between each node (if there is more than
one node). Fixed length path patterns have the same restrictions, and for all valid path patterns the
following are true:

• They have at least one node pattern.

• They begin and end with a node pattern.

• They alternate between nodes and relationships.

These are all valid path patterns:

()

(s)--(e)

(:Station)--()<--(m WHERE m.departs > time('12:00'))-->()-[:NEXT]->(n)

These are invalid path patterns:

-->

()-->

()-->-->()

Path pattern matching


This section contains an example of matching a path pattern to paths in a property graph.

It uses the following graph:

[Figure: example graph. Stations London Victoria, Elephant & Castle, Denmark Hill, Clapham High Street, and Peckham Rye; Stop nodes with departs times have CALLS_AT relationships to the Stations and are chained by NEXT relationships.]
To recreate the graph, run the following query against an empty Neo4j database:

CREATE (pmr:Station {name: 'Peckham Rye'}),
(dmk:Station {name: 'Denmark Hill'}),
(vic:Station {name: 'London Victoria'}),
(clp:Station {name: 'Clapham High Street'}),
(eph:Station {name: 'Elephant & Castle'}),
(vic)<-[:CALLS_AT]-(s1:Stop {departs: time('11:55')}),
(dmk)<-[:CALLS_AT]-(s2:Stop {departs: time('11:44')})-[:NEXT]->(s1),
(pmr)<-[:CALLS_AT]-(s3:Stop {departs: time('11:40')})-[:NEXT]->(s2),
(clp)<-[:CALLS_AT]-(s4:Stop {departs: time('11:41')}),
(dmk)<-[:CALLS_AT]-(s5:Stop {departs: time('11:37')})-[:NEXT]->(s4),
(pmr)<-[:CALLS_AT]-(s6:Stop {departs: time('11:33')})-[:NEXT]->(s5),
(eph)<-[:CALLS_AT]-(s7:Stop {departs: time('11:54')}),
(dmk)<-[:CALLS_AT]-(s8:Stop {departs: time('11:47')})-[:NEXT]->(s7),
(pmr)<-[:CALLS_AT]-(s9:Stop {departs: time('11:44')})-[:NEXT]->(s8)

The graph contains a number of train Stations and Stops. A Stop represents the arrival and departure of a
train that CALLS_AT a Station. Each Stop forms part of a sequence of Stops connected by relationships with
the type NEXT, representing the order of calling points made by a train service.

The graph shows three chains of Stops that represent different train services. Each of these services calls
at the Station with the name Denmark Hill.

To return all Stops that call at the Station Denmark Hill, the following motif is used (the term motif is used
to describe the pattern looked for in the graph):


(:Stop) -[:CALLS_AT]-> (:Station {name: "Denmark Hill"})

In this case, three paths in the graph match the structure of the motif (plus the predicate anchoring to the
Station Denmark Hill):

In order to return the name of each Stop that calls at a Station, declare a variable in the Stop node pattern.
The results will then have a row containing the departs value of each Stop for each match shown above:

Query

MATCH (s:Stop)-[:CALLS_AT]->(:Station {name: 'Denmark Hill'})
RETURN s.departs AS departureTime

Result

departureTime

"11:44:00Z"

"11:47:00Z"

"11:37:00Z"

Rows: 3
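The same match can also bind the entire path to a variable, giving access to all of its nodes and relationships. A minimal sketch of this variant:

// p is bound to each matched path; nodes() and relationships() return
// its elements as lists.
MATCH p = (:Stop)-[:CALLS_AT]->(:Station {name: 'Denmark Hill'})
RETURN nodes(p) AS stops, relationships(p) AS callsAt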

Variable length patterns


Cypher can be used to match patterns of a variable or an unknown length. Such patterns can be found
using quantified path patterns and quantified relationships. This page also discusses how variables work
when declared in quantified path patterns (group variables), and how to use predicates in quantified path
patterns.

Quantified path patterns (introduced in Neo4j 5.9)


This section considers how to match paths of varying length by using quantified path patterns, allowing
you to search for paths whose lengths are unknown or within a specific range.

Quantified path patterns can be useful when, for example, searching for all nodes that can be reached
from an anchor node, finding all paths connecting two nodes, or when traversing a hierarchy that may
have differing depths.

This example uses a new graph:

[Figure: Station nodes (Denmark Hill, Peckham Rye, Clapham High Street, Wandsworth Road, Clapham Junction) and Stop nodes with arrives and departs times, connected by CALLS_AT and NEXT relationships.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE (pmr:Station {name: 'Peckham Rye'}),
(dmk:Station {name: 'Denmark Hill'}),
(clp:Station {name: 'Clapham High Street'}),
(wwr:Station {name: 'Wandsworth Road'}),
(clj:Station {name: 'Clapham Junction'}),
(s1:Stop {arrives: time('17:19'), departs: time('17:20')}),
(s2:Stop {arrives: time('17:12'), departs: time('17:13')}),
(s3:Stop {arrives: time('17:10'), departs: time('17:11')}),
(s4:Stop {arrives: time('17:06'), departs: time('17:07')}),
(s5:Stop {arrives: time('16:58'), departs: time('17:01')}),
(s6:Stop {arrives: time('17:17'), departs: time('17:20')}),
(s7:Stop {arrives: time('17:08'), departs: time('17:10')}),
(clj)<-[:CALLS_AT]-(s1), (wwr)<-[:CALLS_AT]-(s2),
(clp)<-[:CALLS_AT]-(s3), (dmk)<-[:CALLS_AT]-(s4),
(pmr)<-[:CALLS_AT]-(s5), (clj)<-[:CALLS_AT]-(s6),
(dmk)<-[:CALLS_AT]-(s7),
(s5)-[:NEXT {distance: 1.2}]->(s4),(s4)-[:NEXT {distance: 0.34}]->(s3),
(s3)-[:NEXT {distance: 0.76}]->(s2), (s2)-[:NEXT {distance: 0.3}]->(s1),
(s7)-[:NEXT {distance: 1.4}]->(s6)

Each Stop on a service CALLS_AT one Station. Each Stop has the properties arrives and departs that give
the times the train is at the Station. Following the NEXT relationship of a Stop will give the next Stop of the
service.

For this example, a path pattern is constructed to match each of the services that allow passengers to
travel from Denmark Hill to Clapham Junction. The following shows the two paths that the path pattern
should match:

The following motif represents a fixed-length path pattern that matches the service that departs from
Denmark Hill station at 17:07:

[Motif diagram: four calling points connected in sequence by NEXT relationships, with CALLS_AT relationships linking the first to the origin Station and the last to the destination Station.]

To match the second train service, leaving Denmark Hill at 17:10, a shorter path pattern is needed:

[Motif diagram: two calling points connected by a single NEXT relationship, with CALLS_AT relationships to the origin and destination Stations.]

Translating the motifs into Cypher, and adding predicates to match the origin and destination Stations,
yields the following two path patterns respectively:

(:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(:Stop)
-[:NEXT]->(:Stop)
-[:NEXT]->(:Stop)
-[:NEXT]->(:Stop)-[:CALLS_AT]->
(:Station { name: 'Clapham Junction' })

(:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(:Stop)
-[:NEXT]->(:Stop)-[:CALLS_AT]->
(:Station { name: 'Clapham Junction' })

To return both solutions in the same query using these fixed-length path patterns, a UNION of two MATCH
statements would be needed. For example, the following query returns the departure of the two services:

Query

MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
-[:NEXT]->(:Stop)
-[:NEXT]->(:Stop)
-[:NEXT]->(a:Stop)-[:CALLS_AT]->
(:Station { name: 'Clapham Junction' })
RETURN d.departs AS departureTime, a.arrives AS arrivalTime
UNION
MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
-[:NEXT]->(a:Stop)-[:CALLS_AT]->
(:Station { name: 'Clapham Junction' })
RETURN d.departs AS departureTime, a.arrives AS arrivalTime

Result

departureTime arrivalTime

"17:07:00Z" "17:19:00Z"

"17:10:00Z" "17:17:00Z"

Rows: 2

The problem with this solution is that it is not only verbose, but it can also only be used where the lengths
of the target paths are known in advance. Quantified path patterns solve this problem by extracting repeating
parts of a path pattern into parentheses and applying a quantifier. That quantifier specifies a range of
possible repetitions of the extracted pattern to match on. For the current example, the first step is
identifying the repeating pattern, which in this case is the sequence of alternating Stop nodes and NEXT
relationships, representing one segment of a Service:

(:Stop)-[:NEXT]->(:Stop)

The shortest path contains one instance of this pattern, and the longest contains three. The quantifier
applied to the wrapping parentheses is therefore the range one to three, expressed as {1,3}:

((:Stop)-[:NEXT]->(:Stop)){1,3}

This quantifier also allows two repetitions, although in this graph that repetition returns no matches. To
understand the semantics of this pattern, it helps to work through the expansion of the repetitions. Here are
the three repetitions specified by the quantifier, combined into a union of path patterns:

(:Stop)-[:NEXT]->(:Stop) |
(:Stop)-[:NEXT]->(:Stop)(:Stop)-[:NEXT]->(:Stop) |
(:Stop)-[:NEXT]->(:Stop)(:Stop)-[:NEXT]->(:Stop)(:Stop)-[:NEXT]->(:Stop)

The union operator (|) and the juxtaposition of node patterns are used here for illustration only; they are
not valid Cypher syntax in this position. Where two node patterns are next to each other in the expansion
above, they must necessarily match the same node: the next segment of a Service starts where the
previous segment ends. As such, they can be rewritten as a single node pattern, with any filtering
conditions combined conjunctively. In this example this is trivial, because the only filtering applied to
those nodes is the label Stop.

With this, the union of path patterns simplifies to:

(:Stop)-[:NEXT]->(:Stop) |
(:Stop)-[:NEXT]->(:Stop)-[:NEXT]->(:Stop) |
(:Stop)-[:NEXT]->(:Stop)-[:NEXT]->(:Stop)-[:NEXT]->(:Stop)

The segments of the original path pattern that connect the Stations to the Stops can also be rewritten.
Here is what those segments look like when concatenated with the first repetition:

(:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(:Stop)
(:Stop)-[:NEXT]->(:Stop)
(:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })

The original MATCH clause now has three parts: the pattern anchoring the path at the Denmark Hill
Station, the quantified path pattern matching the intermediate Stops, and the pattern anchoring the path
at the Clapham Junction Station.

Translating the union of fixed-length path patterns into a quantified path pattern results in a pattern that
will return the correct paths. The following query adds a RETURN clause that yields the departure and arrival
times of the two services:

Query

MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
((:Stop)-[:NEXT]->(:Stop)){1,3}
(a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
RETURN d.departs AS departureTime, a.arrives AS arrivalTime

Result

departureTime arrivalTime

"17:10Z" "17:17Z"

"17:07Z" "17:19Z"

Rows: 2

Quantified relationships (introduced in Neo4j 5.9)


Quantified relationships allow some simple quantified path patterns to be re-written in a more succinct
way. Continuing with the example of Stations and Stops from the previous section, consider the following
query:

Query

MATCH (d:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(n:Stop)
((:Stop)-[:NEXT]->(:Stop)){1,10}
(m:Stop)-[:CALLS_AT]->(a:Station { name: 'Clapham Junction' })
WHERE m.arrives < time('17:18')
RETURN n.departs AS departureTime

If the relationship NEXT only connects Stop nodes, the :Stop label expressions can be removed:

Query

MATCH (d:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(n:Stop)
(()-[:NEXT]->()){1,10}
(m:Stop)-[:CALLS_AT]->(a:Station { name: 'Clapham Junction' })
WHERE m.arrives < time('17:18')
RETURN n.departs AS departureTime

When the quantified path pattern has one relationship pattern, it can be abbreviated to a quantified
relationship. A quantified relationship is a relationship pattern with a postfix quantifier. Below is the
previous query rewritten with a quantified relationship:

Query

MATCH (d:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-
(n:Stop)-[:NEXT]->{1,10}(m:Stop)-[:CALLS_AT]->
(a:Station { name: 'Clapham Junction' })
WHERE m.arrives < time('17:18')
RETURN n.departs AS departureTime

The scope of the quantifier {1,10} is the relationship pattern -[:NEXT]-> and not the node patterns
abutting it. More generally, where a path pattern contained in a quantified path pattern has the following
form:

(() <relationship pattern> ()) <quantifier>

then it can be re-written as follows:

<relationship pattern> <quantifier>

Prior to the introduction of quantified path patterns and quantified relationships in Neo4j
5.9, the only method in Cypher to match paths of a variable length was through variable-
length relationships. This syntax is still available but it is not GQL conformant. It is very
similar to the syntax for quantified relationships, with the following differences:

• Position and syntax of quantifier.


 • Semantics of the asterisk symbol.

• Type expressions are limited to the disjunction operator.

• The WHERE clause is not allowed.

For more information, see the reference section on variable-length relationships.
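As an illustration of the older syntax, the earlier quantified-relationship query could be written roughly as follows with a variable-length relationship; this is a sketch only, reusing the same Stations and Stops graph:

// Legacy syntax: the quantifier *1..10 is written inside the brackets
// after the relationship type, rather than after the pattern.
MATCH (d:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-(n:Stop)
      -[:NEXT*1..10]->(m:Stop)-[:CALLS_AT]->
      (a:Station {name: 'Clapham Junction'})
WHERE m.arrives < time('17:18')
RETURN n.departs AS departureTime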

Group variables
This section uses the example of Stations and Stops used in the previous section, but with an additional
property distance added to the NEXT relationships:

[Figure: the Stations and Stops graph from the previous section, with a distance property (1.2, 0.34, 0.76, 0.3, and 1.4) added to each NEXT relationship.]

As the name suggests, this property represents the distance between two Stops. To return the total
distance for each service connecting a pair of Stations, a variable referencing each of the relationships
traversed is needed. Similarly, to extract the departs and arrives properties of each Stop, variables
referencing each of the nodes traversed are required. In this example of matching services between Denmark
Hill and Clapham Junction, the variables l and m are declared to match the Stops and r is declared to
match the relationships. The variable origin only matches the first Stop in the path:

MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(origin)
((l)-[r:NEXT]->(m)){1,3}
()-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })

Variables that are declared inside quantified path patterns are known as group variables. They are so
called because, when referred to outside of the quantified path pattern, they are lists of the nodes or
relationships they are bound to in the match. To understand how group variables are bound to nodes or
relationships, it helps to expand the quantified path pattern and observe how the different variables match
the elements of the overall matched path. Here are the three different expansions, one for each value in the
range given by the quantifier {1,3}:

(l1)-[r1:NEXT]->(m1) |
(l1)-[r1:NEXT]->(m1)(l2)-[r2:NEXT]->(m2) |
(l1)-[r1:NEXT]->(m1)(l2)-[r2:NEXT]->(m2)(l3)-[r3:NEXT]->(m3)

The subscript of each variable indicates which instance of the path pattern repetition they belong to. The
following diagram shows the variable bindings of the path pattern with three repetitions, which matches
the service that departs Denmark Hill at 17:07. It traces the node or relationship that each indexed variable
is bound to. Note that the index increases from right to left as the path starts at Denmark Hill:

For this matched path, the group variables have the following bindings:

l => [n2, n3, n4]
r => [r2, r3, r4]
m => [n3, n4, n5]

The second solution is the following path:

[Figure: the second matched path: the Stop n7 (departs 17:10, calls at Denmark Hill) connected by the relationship r8 (distance 1.4) to the Stop n8 (calls at Clapham Junction).]

The following table shows the bindings for both matches, including the variable origin. In contrast to the
group variables, origin is a singleton variable due to being declared outside the quantification. Singleton
variables bind at most to one node or relationship.

origin l r m

n2 [n2, n3, n4] [r2, r3, r4] [n3, n4, n5]

n7 [n7] [r8] [n8]

Returning to the original goal, which was to return the sequence of depart times for the Stops and the total
distance of each service, the final query exploits the compatibility of group variables with list
comprehensions and list functions such as reduce():

Query

MATCH (:Station {name: 'Denmark Hill'})<-[:CALLS_AT]-(origin)
((l)-[r:NEXT]->(m)){1,3}
()-[:CALLS_AT]->(:Station {name: 'Clapham Junction'})
RETURN origin.departs + [stop in m | stop.departs] AS departureTimes,
reduce(acc = 0.0, next in r | round(acc + next.distance, 2)) AS totalDistance

Result

departureTimes totalDistance

["17:10:00Z", "17:20:00Z"] 1.4

["17:07:00Z", "17:11:00Z", "17:13:00Z", "17:20:00Z"] 1.4

Rows: 2

Predicates in quantified path patterns


One of the pitfalls of quantified path patterns is that, depending on the graph, they can end up matching
very large numbers of paths, resulting in slow query performance. This is especially true when searching
for paths with a large maximum length or when the pattern is too general. However, by using inline
predicates that specify precisely which nodes and relationships should be included in the results,
unwanted results will be pruned as the graph is traversed.

Here are some examples of the types of constraints you can impose on quantified path pattern traversals:

• Nodes must have certain combinations of labels. For example, all nodes must be an Employee, but not a
Contractor.

• Relationships must have certain types. For example, all relationships in the path must be of type
EMPLOYED_BY.

• Nodes or relationships must have properties satisfying some condition. For example, all relationships
must have the property distance > 10.
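The following sketch shows how constraints of this kind can be written as inline predicates inside a quantified path pattern. The Employee and Contractor labels, the EMPLOYED_BY type, and the distance property are the illustrative names used in the list above and are not part of the example graph:

// All nodes must be Employee but not Contractor, all relationships must
// be EMPLOYED_BY, and every relationship must have distance > 10.
MATCH (start:Employee)
      ((a:Employee&!Contractor)-[r:EMPLOYED_BY WHERE r.distance > 10]->
       (b:Employee&!Contractor))+
      (finish:Employee)
RETURN start, finish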

To demonstrate the utility of predicates in quantified path patterns, this section considers an example of
finding the shortest path by physical distance and compares that to the results yielded by using the
SHORTEST keyword. The graph in this example continues with Station nodes, but adds both a geospatial
location property to the Stations, as well as LINK relationships with a distance property representing the
distance between pairs of Stations:

[Figure: Station nodes (London Blackfriars, London Bridge, Elephant & Castle, South Bermondsey, Queens Rd Peckham, Peckham Rye, Denmark Hill, Loughborough Jn, Brixton, East Dulwich, North Dulwich, Herne Hill, Tulse Hill) connected by LINK relationships labelled with the distance between each pair of Stations.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE (lbg:Station {name: "London Bridge"}),
(bfr:Station {name: "London Blackfriars"}),
(eph:Station {name: "Elephant & Castle"}),
(dmk:Station {name: "Denmark Hill"}),
(pmr:Station {name: "Peckham Rye"}),
(qrp:Station {name: "Queens Rd Peckham"}),
(sbm:Station {name: "South Bermondsey"}),
(lgj:Station {name: "Loughborough Jn"}),
(hnh:Station {name: "Herne Hill"}),
(tuh:Station {name: "Tulse Hill"}),
(ndl:Station {name: "North Dulwich"}),
(edw:Station {name: "East Dulwich"}),
(brx:Station {name: "Brixton"})
SET lbg.location = point({longitude: -0.08609, latitude: 51.50502}),
bfr.location = point({longitude: -0.10333, latitude: 51.51181}),
eph.location = point({longitude: -0.09873, latitude: 51.49403}),
dmk.location = point({longitude: -0.08936, latitude: 51.46820}),
pmr.location = point({longitude: -0.06941, latitude: 51.47003}),
qrp.location = point({longitude: -0.05731, latitude: 51.47357}),
sbm.location = point({longitude: -0.05468, latitude: 51.48814}),
lgj.location = point({longitude: -0.10218, latitude: 51.46630}),
hnh.location = point({longitude: -0.10229, latitude: 51.45331}),
tuh.location = point({longitude: -0.10508, latitude: 51.43986}),
ndl.location = point({longitude: -0.08792, latitude: 51.45451}),
edw.location = point({longitude: -0.08057, latitude: 51.46149}),
brx.location = point({longitude: -0.11418, latitude: 51.46330})
CREATE (lbg)<-[:LINK {distance: 1.13}]-(bfr),
(bfr)<-[:LINK {distance: 1.21}]-(eph),
(eph)-[:LINK {distance: 2.6}]->(dmk),
(dmk)-[:LINK {distance: 0.86}]->(pmr),
(pmr)-[:LINK {distance: 0.71}]->(qrp),
(qrp)<-[:LINK {distance: 0.95}]-(sbm),
(sbm)<-[:LINK {distance: 1.8}]-(lbg),
(lgj)-[:LINK {distance: 0.88}]->(hnh),
(hnh)-[:LINK {distance: 1.08}]->(tuh),
(tuh)<-[:LINK {distance: 1.29}]-(ndl),
(ndl)-[:LINK {distance: 0.53}]->(edw),
(edw)-[:LINK {distance: 0.84}]->(pmr),
(eph)-[:LINK {distance: 2.01}]->(lgj),
(dmk)-[:LINK {distance: 1.11}]->(brx),
(brx)-[:LINK {distance: 0.51}]->(hnh)

The following query finds the path length and total distance for ALL SHORTEST paths between London
Blackfriars to North Dulwich:

Query

MATCH (bfr:Station {name: 'London Blackfriars'}),
(ndl:Station {name: 'North Dulwich'})
MATCH p = ALL SHORTEST (bfr)-[:LINK]-+(ndl)
RETURN [n in nodes(p) | n.name] AS stops,
length(p) as stopCount,
reduce(acc = 0, r in relationships(p) | round(acc + r.distance, 2)) AS distance

Result

stops stopCount distance

["London Blackfriars", "Elephant & Castle", "Denmark Hill", "Peckham 5 6.04


Rye", "East Dulwich", "North Dulwich"]

["London Blackfriars", "Elephant & Castle", "Loughborough Jn", "Herne 5 6.47


Hill", "Tulse Hill", "North Dulwich"]

Rows: 2

ALL SHORTEST finds all shortest paths by number of hops, and as the result shows, there are two paths in
the graph tied for the shortest path. Whether any of these paths corresponds to the shortest path by
distance can be checked by looking at each path between the two end Stations and returning the first
result after ordering by distance:

Query

MATCH (bfr:Station {name: 'London Blackfriars'}),
(ndl:Station {name: 'North Dulwich'})
MATCH p = (bfr)-[:LINK]-+(ndl)
RETURN reduce(acc = 0, r in relationships(p) | round(acc + r.distance, 2))
AS distance
ORDER BY distance LIMIT 1

Result

distance

5.96

Rows: 1

This shows that there is a route with a shorter distance than either of the paths with fewer Stations
returned using ALL SHORTEST. But to get this result, the query had to first find all paths from London
Blackfriars to North Dulwich before it could select the shortest one. The following query shows the
number of possible paths:

Query

MATCH (bfr:Station {name: 'London Blackfriars'}),
(ndl:Station {name: 'North Dulwich'})
MATCH p = (bfr)-[:LINK]-+(ndl)
RETURN count(*) AS numPaths

Result

numPaths

Rows: 1

For a small dataset like this, finding all the paths will be fast. But as the size of the graph grows, the
execution time will increase exponentially. For a real dataset, such as the entire rail network of the UK, it
might be unacceptably long.

One approach to avoiding the exponential explosion in paths is to put a finite upper bound on the quantified
path pattern (e.g. {,10}) to limit the number of repetitions matched. This works fine where the solution
is known to lie within some range of hops. But in cases where this is not known, one alternative would be
to make the pattern more specific by, for example, adding node labels, or by specifying a relationship
direction. Another alternative would be to add an inline predicate to the quantified path pattern.

In this example, an inline predicate can be added that takes advantage of the geospatial location property
of the Stations: for each pair of Stations on the path, the second Station will be closer to the endpoint
(not always true, but is assumed here to keep the example simple). To compose the predicate, the
point.distance() function is used to compare the distance between the left-hand Station (a) and the right-
hand Station (b) for each node-pair along the path to the destination North Dulwich:

Query

MATCH (bfr:Station {name: "London Blackfriars"}),


(ndl:Station {name: "North Dulwich"})
MATCH p = (bfr)
((a)-[:LINK]-(b:Station)
WHERE point.distance(a.location, ndl.location) >
point.distance(b.location, ndl.location))+ (ndl)
RETURN reduce(acc = 0, r in relationships(p) | round(acc + r.distance, 2))
AS distance

Result

distance

5.96

Rows: 1

This query avoids having to find all possible paths and then imposing a LIMIT 1 to find the shortest one by
distance. It also shows that there is only one path that satisfies the query (a number that would remain
constant even if data from the rest of the UK railway network were included). Using inline predicates or
making quantified path patterns more specific where possible can thus greatly improve query performance.

Shortest paths
The Cypher keyword SHORTEST is used to find variations of the shortest paths between nodes. This
includes the ability to look for the shortest, second-shortest (and so on) paths, all available shortest paths,
and groups of paths containing the same pattern length. The ANY keyword, which can be used to test the
reachability of nodes from a given node(s), is also explained, as is how to apply filters in queries using
SHORTEST.

SHORTEST functionally replaces and extends the shortestPath() and allShortestPaths()
functions. Both functions can still be used, but they are not GQL conformant. For more
information, see Syntax and semantics → The shortestPath() and allShortestPaths()
functions.
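For reference, a rough equivalent of a SHORTEST 1 match written with the legacy function looks as follows; this is a sketch only, using the Station graph introduced in the SHORTEST k section below:

// shortestPath() takes a single variable-length pattern and returns one
// shortest path between the two bound end nodes, if any exists.
MATCH (wos:Station {name: 'Worcester Shrub Hill'}),
      (bmv:Station {name: 'Bromsgrove'})
MATCH p = shortestPath((wos)-[:LINK*]-(bmv))
RETURN length(p) AS result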

Note on Cypher and GDS shortest paths


Both Cypher and Neo4j's Graph Data Science (GDS) library can be used to find variations of the shortest
paths between nodes.

Use Cypher if:

• You need to specify complex graph navigation via quantified path patterns.

• Creating a graph projection takes too long.

• GDS is not available in your instance, or the size of the GDS projection is too large for your instance.

Use GDS if:

• You need to compute a weighted shortest path.

• You need a specific algorithm like A* or Yen’s.

• You need to transform the graph with a projection before finding shortest path.

• You need to use shortest paths in conjunction with other GDS algorithms in the pipeline.

To read more about the shortest path algorithms included in the GDS library, see GDS Graph algorithms →
Path finding.

SHORTEST k
This section uses the following graph:

[Figure: Station nodes (Hartlebury, Droitwich Spa, Bromsgrove, Worcester Foregate Street, Worcester Shrub Hill, Worcestershire Parkway, Pershore, Ashchurch, Cheltenham Spa) connected by LINK relationships labelled with the distance between each pair of Stations.]

To recreate it, run the following query against an empty Neo4j database:

CREATE (asc:Station {name:"Ashchurch"}),
(bmv:Station {name:"Bromsgrove"}),
(cnm:Station {name:"Cheltenham Spa"}),
(dtw:Station {name:"Droitwich Spa"}),
(hby:Station {name:"Hartlebury"}),
(psh:Station {name:"Pershore"}),
(wop:Station {name:"Worcestershire Parkway"}),
(wof:Station {name:"Worcester Foregate Street"}),
(wos:Station {name:"Worcester Shrub Hill"})
CREATE (asc)-[:LINK {distance: 7.25}]->(cnm),
(asc)-[:LINK {distance: 11.29}]->(wop),
(asc)-[:LINK {distance: 14.75}]->(wos),
(bmv)-[:LINK {distance: 31.14}]->(cnm),
(bmv)-[:LINK {distance: 6.16}]->(dtw),
(bmv)-[:LINK {distance: 12.6}]->(wop),
(dtw)-[:LINK {distance: 5.64}]->(hby),
(dtw)-[:LINK {distance: 6.03}]->(wof),
(dtw)-[:LINK {distance: 5.76}]->(wos),
(psh)-[:LINK {distance: 4.16}]->(wop),
(wop)-[:LINK {distance: 3.71}]->(wos),
(wof)-[:LINK {distance: 0.65}]->(wos)

The paths matched by a path pattern can be restricted to only the shortest (by number of hops) by
including the keyword SHORTEST k, where k is the number of paths to match. For example, the following
example uses SHORTEST 1 to return the length of the shortest path between Worcester Shrub Hill and
Bromsgrove:

Query

MATCH p = SHORTEST 1 (wos:Station)-[:LINK]-+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN length(p) AS result

Note that this and the following examples in this section use a quantified relationship
-[:LINK]-+, which is composed of a relationship pattern -[:LINK]- and a postfix
quantifier +. The relationship pattern is only concerned with following relationships
with type LINK, and will otherwise traverse any node along the way. There is no
arrowhead < or > on the relationship pattern, allowing the pattern to match
relationships going in either direction. This represents the fact that trains can go in both
directions along the LINK relationships between Stations. The + quantifier means that
one or more relationships should be matched. For more information, see Syntax and
semantics - quantified relationships.

Result

result

2

Rows: 1

Although the query returned a single result, there are in fact two paths that are tied for shortest.

Because 1 was specified in SHORTEST, only one of the paths is returned. Which one is returned is
non-deterministic.

If instead SHORTEST 2 is specified, all shortest paths in this example would be returned, and the result
would be deterministic:

Query

MATCH p = SHORTEST 2 (wos:Station)-[:LINK]-+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p) | n.name] AS stops

Result

stops

["Worcester Shrub Hill", "Droitwich Spa", "Bromsgrove"]

["Worcester Shrub Hill", "Worcestershire Parkway", "Bromsgrove"]

Rows: 2

Increasing the number of paths will return the next shortest paths. Three paths are tied for the second
shortest.

The following query returns all three of the second shortest paths, along with the two shortest paths:

Query

MATCH p = SHORTEST 5 (wos:Station)-[:LINK]-+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p) | n.name] AS stops

Result

stops

["Worcester Shrub Hill", "Droitwich Spa", "Bromsgrove"]

["Worcester Shrub Hill", "Worcestershire Parkway", "Bromsgrove"]

["Worcester Shrub Hill", "Worcester Foregate Street", "Droitwich Spa", "Bromsgrove"]

["Worcester Shrub Hill", "Ashchurch", "Worcestershire Parkway", "Bromsgrove"]

["Worcester Shrub Hill", "Ashchurch", "Cheltenham Spa", "Bromsgrove"]

Rows: 5

If there had been only four possible paths between the two Stations, then only those four would have been
returned.

ALL SHORTEST
To return all paths that are tied for shortest length, use the keywords ALL SHORTEST:

Query

MATCH p = ALL SHORTEST (wos:Station)-[:LINK]-+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p) | n.name] AS stops

Result

stops

["Worcester Shrub Hill", "Droitwich Spa", "Bromsgrove"]

["Worcester Shrub Hill", "Worcestershire Parkway", "Bromsgrove"]

Rows: 2

SHORTEST k GROUPS
To return all paths that are tied for first, second, and so on up to the kth shortest length, use SHORTEST k
GROUPS. For example, the following returns the first and second shortest length paths between Worcester
Shrub Hill and Bromsgrove:

Query

MATCH p = SHORTEST 2 GROUPS (wos:Station)-[:LINK]-+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN [n in nodes(p) | n.name] AS stops, length(p) AS pathLength

Result

stops pathLength

["Worcester Shrub Hill", "Droitwich Spa", "Bromsgrove"] 2

["Worcester Shrub Hill", "Worcestershire Parkway", "Bromsgrove"] 2

["Worcester Shrub Hill", "Worcester Foregate Street", "Droitwich Spa", 3


"Bromsgrove"]

["Worcester Shrub Hill", "Ashchurch", "Worcestershire Parkway", 3


"Bromsgrove"]

["Worcester Shrub Hill", "Ashchurch", "Cheltenham Spa", "Bromsgrove"] 3

Rows: 5

The first group includes the two shortest paths with pathLength = 2 (as seen in the first two rows of the
results), and the second group includes the three second shortest paths with pathLength = 3 (as seen in
the last three rows of the results).

If more groups are specified than exist in the graph, only those paths that exist are returned. For example,
if the eight shortest path groups are requested for paths from Worcester Shrub Hill to Bromsgrove, only
seven groups are returned:

Query

MATCH p = SHORTEST 8 GROUPS (wos:Station)-[:LINK]-+(bmv:Station)
WHERE wos.name = "Worcester Shrub Hill" AND bmv.name = "Bromsgrove"
RETURN length(p) AS pathLength, count(*) AS numPaths

Result

pathLength numPaths

2 2

3 3

4 1

5 4

6 8

7 10

8 6

Rows: 7

ANY
The ANY keyword can be used to test the reachability of nodes from a given node(s). It returns the same as
SHORTEST 1, but by using the ANY keyword the intent of the query is clearer. For example, the following
query shows that there exists a route from Pershore to Bromsgrove where the distance between each pair
of stations is less than 10 miles:

Query

MATCH path = ANY
(:Station {name: 'Pershore'})-[l:LINK WHERE l.distance < 10]-+(b:Station {name: 'Bromsgrove'})
RETURN [r IN relationships(path) | r.distance] AS distances

Result

distances

[4.16, 3.71, 5.76, 6.16]

Rows: 1

Partitions
When there are multiple start or end nodes matching a path pattern, the matches are partitioned into
distinct pairs of start and end nodes prior to selecting the shortest paths; a partition is one distinct pair of
start node and end node. The selection of shortest paths is then done from all paths that join the start and
end node of a given partition. The results are then formed from the union of all the shortest paths found for
each partition.

For example, if the start nodes of matches are bound to either Droitwich Spa or Hartlebury, and the end
nodes are bound to either Ashchurch or Cheltenham Spa, there will be four distinct pairs of start and end
nodes, and therefore four partitions:

Start node        End node

Droitwich Spa     Ashchurch

Droitwich Spa     Cheltenham Spa

Hartlebury        Ashchurch

Hartlebury        Cheltenham Spa

The following query illustrates how these partitions define the sets of results within which the shortest
paths are selected. It uses a pair of UNWIND clauses to generate a Cartesian product of the names of the
Stations (all possible pairs of start node and end node), followed by the MATCH clause to find the shortest
two groups of paths for each pair of distinct start and end Stations:

Query

UNWIND ["Droitwich Spa", "Hartlebury"] AS a


UNWIND ["Ashchurch", "Cheltenham Spa"] AS b
MATCH SHORTEST 2 GROUPS (o:Station {name: a})-[l]-+(d:Station {name: b})
RETURN o.name AS start, d.name AS end,
size(l) AS pathLength, count(*) AS numPaths
ORDER BY start, end, pathLength

Result

start end pathLength numPaths

"Droitwich Spa" "Ashchurch" 2 1

"Droitwich Spa" "Ashchurch" 3 4

"Droitwich Spa" "Cheltenham Spa" 2 1

"Droitwich Spa" "Cheltenham Spa" 3 1

"Hartlebury" "Ashchurch" 3 1

"Hartlebury" "Ashchurch" 4 4

"Hartlebury" "Cheltenham Spa" 3 1

"Hartlebury" "Cheltenham Spa" 4 1

Rows: 8

Each partition appears twice: once for the group of shortest paths and once for the group of second
shortest paths. For example, for the partition of Droitwich Spa as the start and Ashchurch as the end, the
shortest path group (paths with length 2) has one path, and the second shortest path group (paths with
length 3) has four paths.

Pre-filters and post-filters


The position of a filter in a shortest path query will affect whether it is applied before or after selecting the
shortest paths. To see the difference, first consider a query that returns the shortest path from Hartlebury
to Cheltenham Spa:

Query

MATCH SHORTEST 1
(:Station {name: 'Hartlebury'})
(()--(n))+
(:Station {name: 'Cheltenham Spa'})
RETURN [stop in n[..-1] | stop.name] AS stops

Result

stops

["Droitwich Spa", "Bromsgrove"]

Rows: 1

Note that n[..-1] is a slicing operation that returns all elements of n except the last. If instead the query
uses a WHERE clause at the MATCH level to filter out routes that go via Bromsgrove, the filtering is applied
after the shortest paths are selected. This results in the only solution being removed, and no results being
returned:

Query

MATCH SHORTEST 1
(:Station {name: 'Hartlebury'})
(()--(n:Station))+
(:Station {name: 'Cheltenham Spa'})
WHERE none(stop IN n[..-1] WHERE stop.name = 'Bromsgrove')
RETURN [stop in n[..-1] | stop.name] AS stops

Result

stops

Rows: 0

There are two ways to turn a post-filter without solutions into a pre-filter that returns solutions. One is to
inline the predicate into the path pattern:

Query

MATCH SHORTEST 1
(:Station {name: 'Hartlebury'})
(()--(n:Station WHERE n.name <> 'Bromsgrove'))+
(:Station {name: 'Cheltenham Spa'})
RETURN [stop in n[..-1] | stop.name] AS stops

Result

stops

["Droitwich Spa", "Worcester Shrub Hill", "Ashchurch"]

Rows: 1

The shortest journey that avoids Bromsgrove is now returned.

An alternative is to wrap the path pattern and filter in parentheses (leaving the SHORTEST keyword on the
outside):

Query

MATCH SHORTEST 1
( (:Station {name: 'Hartlebury'})
(()--(n:Station))+
(:Station {name: 'Cheltenham Spa'})
WHERE none(stop IN n[..-1] WHERE stop.name = 'Bromsgrove') )
RETURN [stop in n[..-1] | stop.name] AS stops

Result

stops

["Droitwich Spa", "Worcester Shrub Hill", "Ashchurch"]

Rows: 1

Pre-filter with a path variable


The previous section showed how to apply a filter before the shortest path selection by the use of
parentheses. Placing a path variable declaration before the shortest path keywords, however, places it
outside the scope of the parentheses. To reference a path variable in a pre-filter, it has to be declared
inside the parentheses.

To illustrate, consider this example that returns all shortest paths from Hartlebury to each of the other
Stations:

Query

MATCH p = SHORTEST 1 (:Station {name: 'Hartlebury'})--+(b:Station)
RETURN b.name AS destination, length(p) AS pathLength
ORDER BY pathLength, destination

Result

destination pathLength

"Droitwich Spa" 1

"Bromsgrove" 2

"Worcester Foregate Street" 2

"Worcester Shrub Hill" 2

"Ashchurch" 3

"Cheltenham Spa" 3

"Worcestershire Parkway" 3

"Pershore" 4

Rows: 8

If the query is altered to only include routes that have an even number of stops, adding a WHERE clause at
the MATCH level will not work, because it would be a post-filter. It would return the results of the previous
query with all routes with an odd number of stops removed:

Query

MATCH p = SHORTEST 1 (:Station {name: 'Hartlebury'})--+(b:Station)
WHERE length(p) % 2 = 0
RETURN b.name AS destination, length(p) AS pathLength
ORDER BY pathLength, destination

Result

destination pathLength

"Bromsgrove" 2

"Worcester Foregate Street" 2

"Worcester Shrub Hill" 2

"Pershore" 4

Rows: 4

To move the predicate to a pre-filter, the path variable should be referenced from within the parentheses,
and the shortest routes with an even number of stops will be returned for all the destinations:

Query

MATCH SHORTEST 1
(p = (:Station {name: 'Hartlebury'})--+(b:Station)
WHERE length(p) % 2 = 0 )
RETURN b.name AS destination, length(p) AS pathLength
ORDER BY pathLength, destination

Result

destination pathLength

"Bromsgrove" 2

"Worcester Foregate Street" 2

"Worcester Shrub Hill" 2

"Ashchurch" 4

"Cheltenham Spa" 4

"Droitwich Spa" 4

"Pershore" 4

"Worcestershire Parkway" 4

Rows: 8

Planning shortest path queries


This section describes the operators used when planning shortest path queries. For readers not familiar
with Cypher execution plans and operators, it is recommended to first read the section Understanding
execution plans.

There are two operators used to plan SHORTEST queries:

• StatefulShortestPath(All) - uses a unidirectional breadth-first search algorithm to find shortest
paths from a previously matched start node to an end node that has not yet been matched.

• StatefulShortestPath(Into) - uses a bidirectional breadth-first search (BFS) algorithm, where two
simultaneous BFS invocations are performed, one from the left boundary node and one from the right
boundary node.

StatefulShortestPath(Into) is used by the planner when both boundary nodes in the shortest path are
estimated to match at most one node each. Otherwise, StatefulShortestPath(All) is used.

For example, the planner estimates that the left boundary node in the below query will match one node,
and the right boundary node will match five nodes, and chooses to expand from the left boundary node.
Using StatefulShortestPath(Into) would require five bidirectional breadth-first search (BFS) invocations,
whereas StatefulShortestPath(All) would require only one unidirectional BFS invocation. As a result, the
query will use StatefulShortestPath(All).

Query planned with StatefulShortestPath(All)

PROFILE
MATCH
p = SHORTEST 1 (a:Station {name: "Worcestershire Parkway"})(()-[]-()-[]-()){1,}(b:Station)
RETURN p

Result

+----------------------------+----+-------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator                   | Id | Details                                                                 | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+----------------------------+----+-------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults            |  0 | p                                                                       |              5 |    9 |     122 |              0 | 0/0                    |    10.967 |                     |
| +Projection                |  1 | (a) ((anon_12)-[anon_14]-(anon_13)-[anon_11]-())* (b) AS p              |              5 |    9 |       0 |                | 0/0                    |     0.063 |                     |
| +StatefulShortestPath(All) |  2 | SHORTEST 1 (a) ((`anon_5`)-[`anon_6`]-(`anon_7`)-[`anon_8`]-(`anon_9`)) |              5 |    9 |      80 |          18927 | 0/0                    |     1.071 | In Pipeline 1       |
|                            |    | {1, } (b)                                                               |                |      |         |                |                        |           |                     |
|                            |    | expanding from: a                                                       |                |      |         |                |                        |           |                     |
|                            |    | inlined predicates: b:Station                                           |                |      |         |                |                        |           |                     |
| +Filter                    |  3 | a.name = $autostring_0                                                  |              1 |    1 |      18 |                |                        |           |                     |
| +NodeByLabelScan           |  4 | a:Station                                                               |             10 |    9 |      10 |            376 | 3/0                    |     0.811 | Fused in Pipeline 0 |
+----------------------------+----+-------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

However, the heuristic to favor StatefulShortestPath(All) can lead to worse query performance. To have
the planner choose the StatefulShortestPath(Into) instead, rewrite the query using a CALL subquery,
which will execute once for each incoming row.

For example, in the below query, using a CALL subquery ensures that the planner binds a and b to exactly
one Station node respectively for each executed row, and this forces it to use
StatefulShortestPath(Into) for each invocation of the CALL subquery, since a precondition of using this
operator is that both boundary nodes match exactly one node each.

The below query uses a variable scope clause (introduced in Neo4j 5.23) to import

 variables into the CALL subquery. If you are using an older version of Neo4j, use an
importing WITH clause instead.
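On versions older than 5.23, the same rewrite can be expressed with an importing WITH clause; a sketch of that variant:

PROFILE
MATCH
  (a:Station {name: "Worcestershire Parkway"}),
  (b:Station)
CALL {
  // Import a and b into the subquery with an importing WITH clause.
  WITH a, b
  MATCH p = SHORTEST 1 (a)(()-[]-()-[]-()){1,}(b)
  RETURN p
}
RETURN p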

Query rewritten to use StatefulShortestPath(Into)

PROFILE
MATCH
(a:Station {name: "Worcestershire Parkway"}),
(b:Station)
CALL (a, b) {
MATCH
p = SHORTEST 1 (a)(()-[]-()-[]-()){1,}(b)
RETURN p
}
RETURN p

Result

+-----------------------------+----+-----------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator                    | Id | Details                                                                           | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+-----------------------------+----+-----------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults             |  0 | p                                                                                 |              5 |    9 |     122 |              0 | 0/0                    |     0.561 |                     |
| +Projection                 |  1 | (a) ((anon_12)-[anon_14]-(anon_13)-[anon_11]-())* (b) AS p                        |              5 |    9 |       0 |                | 0/0                    |     0.060 |                     |
| +StatefulShortestPath(Into) |  2 | SHORTEST 1 (a) ((`anon_5`)-[`anon_6`]-(`anon_7`)-[`anon_8`]-(`anon_9`)){1, } (b)  |              5 |    9 |     176 |          17873 | 0/0                    |     2.273 | In Pipeline 3       |
| +CartesianProduct           |  3 |                                                                                   |              5 |    9 |       0 |           2056 | 0/0                    |     0.048 | In Pipeline 2       |
| |\                          |    |                                                                                   |                |      |         |                |                        |           |                     |
| | +NodeByLabelScan          |  4 | b:Station                                                                         |             10 |    9 |      10 |            392 | 1/0                    |     0.023 | In Pipeline 1       |
| |                           |    |                                                                                   |                |      |         |                |                        |           |                     |
| +Filter                     |  5 | a.name = $autostring_0                                                            |              1 |    1 |      18 |                |                        |           |                     |
| +NodeByLabelScan            |  6 | a:Station                                                                         |             10 |    9 |      10 |            376 | 3/0                    |     0.089 | Fused in Pipeline 0 |
+-----------------------------+----+-----------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Sometimes the planner cannot make reliable estimations about how many nodes a

 pattern node will match. Consider using a property uniqueness constraint where
applicable to help the planner get more reliable estimates.
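For example, a property uniqueness constraint on the name property of Station nodes could be created as follows (the constraint name is illustrative):

// Guarantees at most one Station per name, which also gives the planner
// an exact cardinality estimate for lookups on Station.name.
CREATE CONSTRAINT station_name_unique IF NOT EXISTS
FOR (s:Station) REQUIRE s.name IS UNIQUE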

Non-linear patterns
Cypher can be used to express non-linear patterns, either by equijoins (an operation in which more than
one node or relationship in a path is the same) or by more complicated graph patterns consisting of multiple
path patterns.

Equijoins
An equijoin is an operation on paths that requires more than one of the nodes or relationships of the paths
to be the same. The equality between the nodes or relationships is specified by declaring the same variable
in multiple node patterns or relationship patterns. An equijoin allows cycles to be specified in a path
pattern.

This section uses the following graph:

[Figure: Station nodes (Birmingham Int'l, Coventry, London Euston) and Stop nodes with departs and arrives times, connected by CALLS_AT and NEXT relationships.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE (bhi:Station {name: 'Birmingham International'}),
(cov:Station {name: 'Coventry'}),
(eus:Station {name: 'London Euston'}),
(bhi)<-[:CALLS_AT]-(s1:Stop {departs: time('12:03')}),
(cov)<-[:CALLS_AT]-(s2:Stop {departs: time('11:33')})-[:NEXT]->(s1),
(eus)<-[:CALLS_AT]-(s3:Stop {departs: time('15:54')}),
(cov)<-[:CALLS_AT]-(s4:Stop {departs: time('14:45')})-[:NEXT]->(s3),
(cov)<-[:CALLS_AT]-(s5:Stop {departs: time('09:34')}),
(eus)<-[:CALLS_AT]-(s6:Stop {departs: time('08:40')})-[:NEXT]->(s5)

To illustrate how equijoins work, we will use the problem of finding a round trip between two Stations.

In this example scenario, a passenger starts their outbound journey at London Euston Station and ends at
Coventry Station. The return journey will be the reverse order of those Stations.

The graph has three different services, two of which would compose the desired round trip, and a third
which would send the passenger to Birmingham International.

The solution is the following path with a cycle:

If unique properties exist on the node where the cycle "join" occurs in the path, then it is possible to repeat
the node pattern with a predicate matching on the unique property. The following motif demonstrates how
that can be achieved, repeating a Station node pattern with the name London Euston:

[Motif diagram: the round trip from London Euston to Coventry and back, with the Station node pattern for London Euston repeated at both ends of the path.]

The path pattern equivalent is:

(:Station {name: 'London Euston'})<-[:CALLS_AT]-(:Stop)-[:NEXT]->(:Stop)
-[:CALLS_AT]->(:Station {name: 'Coventry'})<-[:CALLS_AT]-(:Stop)
-[:NEXT]->(:Stop)-[:CALLS_AT]->(:Station {name: 'London Euston'})

There may be cases where a unique predicate is not available. In this case, an equijoin can be used to
define the desired cycle by using a repeated node variable. In the current example, if you declare the same
node variable n in both the first and last node patterns, then the node patterns must match the same node:

[Motif diagram: the same round trip, but with the first and last node patterns both declared with the variable n in place of the repeated name predicate.]

Putting this path pattern with an equijoin in a query, the times of the outbound and return journeys can be
returned:

Query

MATCH (n:Station {name: 'London Euston'})<-[:CALLS_AT]-(s1:Stop)
-[:NEXT]->(s2:Stop)-[:CALLS_AT]->(:Station {name: 'Coventry'})
<-[:CALLS_AT]-(s3:Stop)-[:NEXT]->(s4:Stop)-[:CALLS_AT]->(n)
RETURN s1.departs+'-'+s2.departs AS outbound,
s3.departs+'-'+s4.departs AS `return`

Result

outbound return

"08:40:00Z-09:34:00Z" "14:45:00Z-15:54:00Z"

Rows: 1

Graph patterns
In addition to single path patterns, multiple path patterns can be combined in a comma-separated list
to form a graph pattern. In a graph pattern, each path pattern is matched separately, and where node
variables are repeated in the separate path patterns, the solutions are reduced via equijoins. If there are no
equijoins between the path patterns, the result is a Cartesian product between the separate solutions.
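For instance, in the following sketch no variable is shared between the two path patterns (here each is a single node pattern), so every Station is combined with every Stop:

// No shared variables: every Station row is paired with every Stop row.
MATCH (s:Station), (stop:Stop)
RETURN s.name AS station, stop.departs AS departs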

The benefit of joining multiple path patterns in this way is that it allows the specification of more complex
patterns than the linear paths allowed by a single path pattern. To illustrate this, another example drawn
from the railway model will be used. In this example, a passenger is traveling from Starbeck to
Huddersfield, changing trains at Leeds. To get to Leeds from Starbeck, the passenger can take a direct
service that stops at all stations on the way. However, there is an opportunity to change at one of the
stations (Harrogate) on the way to Leeds, and catch an express train, which may enable the passenger to
catch an earlier train leaving from Leeds, reducing the overall journey time.

This section uses the following graph, showing the train services the passenger can use:

[Figure: three train services shown as chains of Stop nodes linked by NEXT and CALLS_AT relationships. A stopping service departs Starbeck at 11:11 and calls at Harrogate, Hornbeam Park, Pannal, Weeton, Horsforth, Headingley, and Burley Park before arriving at Leeds at 11:53. Express services depart Harrogate at 11:00, 11:20, and 11:40, arriving at Leeds at 11:25, 11:45, and 12:05. Onward services depart Leeds at 11:50 and 12:20, arriving at Huddersfield at 12:07 and 12:37.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE (hgt:Station {name: 'Harrogate'}), (lds:Station {name: 'Leeds'}),
(sbe:Station {name: 'Starbeck'}), (hbp:Station {name: 'Hornbeam Park'}),
(wet:Station {name: 'Weeton'}), (hrs:Station {name: 'Horsforth'}),
(hdy:Station {name: 'Headingley'}), (buy:Station {name: 'Burley Park'}),
(pnl:Station {name: 'Pannal'}), (hud:Station {name: 'Huddersfield'}),
(s9:Stop {arrives: time('11:53')}),
(s8:Stop {arrives: time('11:44'), departs: time('11:45')}),
(s7:Stop {arrives: time('11:40'), departs: time('11:43')}),
(s6:Stop {arrives: time('11:38'), departs: time('11:39')}),
(s5:Stop {arrives: time('11:29'), departs: time('11:30')}),
(s4:Stop {arrives: time('11:24'), departs: time('11:25')}),
(s3:Stop {arrives: time('11:19'), departs: time('11:20')}),
(s2:Stop {arrives: time('11:16'), departs: time('11:17')}),
(s1:Stop {departs: time('11:11')}), (s21:Stop {arrives: time('11:25')}),
(s211:Stop {departs: time('11:00')}), (s10:Stop {arrives: time('11:45')}),
(s101:Stop {departs: time('11:20')}), (s11:Stop {arrives: time('12:05')}),
(s111:Stop {departs: time('11:40')}), (s12:Stop {arrives: time('12:07')}),
(s121:Stop {departs: time('11:50')}), (s13:Stop {arrives: time('12:37')}),
(s131:Stop {departs: time('12:20')}),
(lds)<-[:CALLS_AT]-(s9), (buy)<-[:CALLS_AT]-(s8)-[:NEXT]->(s9),
(hdy)<-[:CALLS_AT]-(s7)-[:NEXT]->(s8), (hrs)<-[:CALLS_AT]-(s6)-[:NEXT]->(s7),
(wet)<-[:CALLS_AT]-(s5)-[:NEXT]->(s6), (pnl)<-[:CALLS_AT]-(s4)-[:NEXT]->(s5),
(hbp)<-[:CALLS_AT]-(s3)-[:NEXT]->(s4), (hgt)<-[:CALLS_AT]-(s2)-[:NEXT]->(s3),
(sbe)<-[:CALLS_AT]-(s1)-[:NEXT]->(s2), (lds)<-[:CALLS_AT]-(s21), (hgt)<-[:CALLS_AT]-(s211)-[:NEXT]->(s21),
(lds)<-[:CALLS_AT]-(s10), (hgt)<-[:CALLS_AT]-(s101)-[:NEXT]->(s10), (lds)<-[:CALLS_AT]-(s11), (hgt)<-
[:CALLS_AT]-(s111)-[:NEXT]->(s11), (hud)<-[:CALLS_AT]-(s12), (lds)<-[:CALLS_AT]-(s121)-[:NEXT]->(s12),
(hud)<-[:CALLS_AT]-(s13), (lds)<-[:CALLS_AT]-(s131)-[:NEXT]->(s13)

The solution to the problem assembles a set of path patterns matching the following three parts: the
stopping service; the express service; and the final leg of the journey from Leeds to Huddersfield. Each
changeover, from stopping to express service and from express to onward service, has to respect the fact
that the arrival time of a previous leg has to be before the departure time of the next leg. This will be
encoded in a single WHERE clause.

The following visualizes the three legs with different colors, and identifies the node variables used to
create the equijoins and anchoring:

[Figure: the three legs of the journey, highlighting the node variables used for equijoins and anchoring: a is the Stop departing Starbeck (sbe), b is the Stop calling at the changeover Station l, c is the Stop arriving at Leeds (lds), and y is the Stop arriving at Huddersfield (hud).]

For the stopping service, it is assumed that the station the passenger needs to change at is unknown. As a
result, the pattern needs to match a variable number of stops before and after the Stop b, the Stop that
calls at the changeover station l. This is achieved by placing the quantified relationship -[:NEXT]->* on
each side of node b. The ends of the path also need to be anchored at a Stop departing from Starbeck at
11:11, as well as at a Stop calling at Leeds:

(:Station {name: 'Starbeck'})<-[:CALLS_AT]-
(a:Stop {departs: time('11:11')})-[:NEXT]->*(b)-[:NEXT]->*
(c:Stop)-[:CALLS_AT]->(lds:Station {name: 'Leeds'})

For the express service, the ends of the path are anchored at the Stop b and at Leeds station, to which lds
will already be bound by the first leg. Although in this particular case there are only two stops on the
service, a more general pattern that can match any number of stops is used:

(b)-[:CALLS_AT]->(l:Station)<-[:CALLS_AT]-(m:Stop)-[:NEXT]->*
(n:Stop)-[:CALLS_AT]->(lds)

Note that as Cypher only allows a relationship to be traversed once in a given match for a graph pattern,
the first and second legs are guaranteed to be different train services. (See relationship uniqueness for
more details.) Similarly, another quantified relationship that bridges the stops calling at Leeds station and
Huddersfield station is added:

(lds)<-[:CALLS_AT]-(x:Stop)-[:NEXT]->*(y:Stop)-[:CALLS_AT]->
(:Station {name: 'Huddersfield'})

The other node variables are for the WHERE clause or for returning data. Putting this together, the resulting
query returns the earliest arrival time achieved by switching to an express service:

Query

MATCH (:Station {name: 'Starbeck'})<-[:CALLS_AT]-
(a:Stop {departs: time('11:11')})-[:NEXT]->*(b)-[:NEXT]->*
(c:Stop)-[:CALLS_AT]->(lds:Station {name: 'Leeds'}),
(b)-[:CALLS_AT]->(l:Station)<-[:CALLS_AT]-(m:Stop)-[:NEXT]->*
(n:Stop)-[:CALLS_AT]->(lds),
(lds)<-[:CALLS_AT]-(x:Stop)-[:NEXT]->*(y:Stop)-[:CALLS_AT]->
(:Station {name: 'Huddersfield'})
WHERE b.arrives < m.departs AND n.arrives < x.departs
RETURN a.departs AS departs,
l.name AS changeAt,
m.departs AS changeDeparts,
y.arrives AS arrives
ORDER BY y.arrives LIMIT 1

Result

departs changeAt changeDeparts arrives

"11:11:00Z" "Harrogate" "11:20:00Z" "12:07:00Z"

Rows: 1

Syntax and semantics


This section contains reference material for looking up the syntax and semantics of specific elements of
graph pattern matching.

Node patterns
A node pattern is a pattern that matches a single node. It can be used on its own in a clause such as MATCH
or EXISTS, or form part of a path pattern.

See also node pattern concepts.

Syntax

nodePattern ::= "(" [ nodeVariable ] [ labelExpression ]
[ propertyKeyValueExpression ] [ elementPatternWhereClause ] ")"

elementPatternWhereClause ::= "WHERE" booleanExpression

For rules on valid node variable names, see the Cypher naming rules.

Rules

Predicates

Three types of predicate can be specified inside a node pattern:

• label expressions

• property key-value expressions

• WHERE clauses

The boolean expression of the WHERE clause can reference any variables within scope of the node pattern.
A node variable needs to be declared in the node pattern in order to reference it in the boolean expression.

If no predicates are specified, then the node pattern matches any node.

Variable binding

If a variable has not been declared elsewhere in the query, it will become bound to nodes when the
matching of its containing path pattern is executed. If it has been bound in a previous clause, then no new
nodes will be bound to the variable; any previously bound nodes that do not match in the current path
pattern will lead to the match being eliminated from the results. See the section on clause composition for
more details on the passing of results between clauses.
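A minimal sketch of this behavior, reusing the Station and Stop schema from earlier sections:

// n is bound to a Station node by the first MATCH; the second MATCH
// reuses n, so it only extends matches from that already-bound node.
MATCH (n:Station {name: 'Denmark Hill'})
MATCH (n)<-[:CALLS_AT]-(s:Stop)
RETURN s.departs AS departs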

Examples
Matches all nodes with the label A and binds them to the variable n:

(n:A)

Matches all nodes with the label B and a property departs with the time value 11:11:

(:B { departs: time('11:11') })

Matches all nodes whose departs property is later than the current time plus 30 minutes:

(n WHERE n.departs > time() + duration('PT30M'))

Relationship patterns
A relationship pattern is a pattern that matches a single relationship. It can only be used with node
patterns on either side of it.

A relationship pattern followed immediately by a quantifier is an abbreviated quantified path pattern called
a quantified relationship.

See also relationship pattern concepts.

Syntax

relationshipPattern ::= fullPattern | abbreviatedRelationship

fullPattern ::=
"<-[" patternFiller "]-"
| "-[" patternFiller "]->"
| "-[" patternFiller "]-"

abbreviatedRelationship ::= "<--" | "--" | "-->"

patternFiller ::= [ relationshipVariable ] [ typeExpression ]
[ propertyKeyValueExpression ] [ elementPatternWhereClause ]

elementPatternWhereClause ::= "WHERE" booleanExpression

Note that the syntax for type expressions in relationship patterns is the same as for label expressions in
node patterns (although unlike node labels, relationships must have exactly one type).

For rules on valid relationship variable names, see the Cypher naming rules.

Rules

Predicates

The following three types of predicate can be specified in the pattern filler of a full relationship pattern (i.e.
a pattern with the square brackets):

• label expressions

• property key-value expressions

• WHERE clauses

A fourth type of predicate specifies the directionality of the relationship with respect to the overall path
pattern, using the less-than or greater-than symbols to form arrows (< and >). If a relationship pattern has
no arrows, it will match relationships of any direction.

The boolean expression of the WHERE clause can reference any variables within scope of the relationship
pattern. A relationship variable needs to be declared in the pattern in order to reference it in the boolean
expression.

If no predicates are specified then the pattern matches all relationships.

Variable binding

If the variable has not been declared elsewhere in the query, it will become bound to relationships when
the matching of its containing path pattern is executed. If it has been bound in a previous clause, then no
new relationships will be bound to the variable; if any previously bound relationships do not match in the
current path pattern, then those matches will be eliminated from the results.

See the chapter on clause composition for more details on the passing of results between clauses.

Examples
Matches all relationships with the type R and binds them to the variable r:

()-[r:R]->()

Matches all relationships with type R and property distance equal to 100:

()-[:R {distance: 100}]->()

Matches all relationships where property distance is between 10 and 100:

()-[r WHERE 10 < r.distance < 100]->()

Matches all relationships that connect nodes with label A as their source and nodes with label B as their
target:

(:A)-->(:B)

Matches all relationships that connect nodes with label A and nodes with label B, irrespective of their
direction:

(:A)--(:B)
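Combining the predicate types in a single relationship pattern, a hedged sketch of a full query (the Station label, the LINK type, and the operator and distance properties are assumptions):

MATCH (a:Station)-[r:LINK {operator: 'X'} WHERE r.distance < 100]->(b:Station)
RETURN a, r, b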

Label expressions
The following applies to both the label expressions of node patterns and the type expressions of
relationship patterns.

A label expression is a boolean predicate composed from label names and a wildcard symbol using
disjunction, conjunction, negation and grouping. A label expression returns true when it matches the set of
labels for a node.

Although relationships have a type rather than labels, the syntax for expressions matching a relationship
type is identical to that of label expressions.

Syntax

labelExpression ::= ":" labelTerm

labelTerm ::=
labelIdentifier
| labelTerm "&" labelTerm
| labelTerm "|" labelTerm
| "!" labelTerm
| "%"
| "(" labelTerm ")"

For valid label identifiers, see the Cypher naming rules.

Rules
The following table lists the symbols used in label expressions:

Symbol   Description                                                                    Precedence

%        Wildcard. Evaluates to true if the label set is non-empty.

()       Contained expression is evaluated before evaluating the outer expression       1 (highest)
         the group is contained in.

!        Negation                                                                       2

&        Conjunction                                                                    3

|        Disjunction                                                                    4 (lowest)

Associativity is left-to-right.

Examples
In the following table, each node pattern is listed together with the label combinations that it matches, for
nodes having the label sets (), (:A), (:B), (:C), (:A:B), (:A:C), (:B:C), and (:A:B:C):

Node pattern        Matches nodes with labels
()                  (), (:A), (:B), (:C), (:A:B), (:A:C), (:B:C), (:A:B:C)
(:A)                (:A), (:A:B), (:A:C), (:A:B:C)
(:A&B)              (:A:B), (:A:B:C)
(:A|B)              (:A), (:B), (:A:B), (:A:C), (:B:C), (:A:B:C)
(:!A)               (), (:B), (:C), (:B:C)
(:!!A)              (:A), (:A:B), (:A:C), (:A:B:C)
(:A&!A)             (no matches)
(:A|!A)             (), (:A), (:B), (:C), (:A:B), (:A:C), (:B:C), (:A:B:C)
(:%)                (:A), (:B), (:C), (:A:B), (:A:C), (:B:C), (:A:B:C)
(:!%)               ()
(:%|!%)             (), (:A), (:B), (:C), (:A:B), (:A:C), (:B:C), (:A:B:C)
(:%&!%)             (no matches)
(:A&%)              (:A), (:A:B), (:A:C), (:A:B:C)
(:A|%)              (:A), (:B), (:C), (:A:B), (:A:C), (:B:C), (:A:B:C)
(:(A&B)&!(B&C))     (:A:B)
(:!A&%)             (:B), (:C), (:B:C)

As relationships have exactly one type each, this expression will never match a relationship:

-[:A&B]->

Similarly, the following will always match a relationship:

-[:%]->

The use of negation can make the conjunction useful in relationship patterns. The following matches
relationships that have a type that is neither A nor B:

-[:!A&!B]->
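As a sketch of how grouping interacts with precedence (the labels A, B, and C are illustrative), the following two node patterns are not equivalent, because & binds more tightly than |:

(:A|B&C)    // equivalent to (:A|(B&C)): label A, or both B and C
(:(A|B)&C)  // label C together with at least one of A or B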

Property key-value expressions

Syntax

propertyKeyValueExpression ::=
"{" propertyKeyValuePairList "}"

propertyKeyValuePairList ::=
propertyKeyValuePair [ "," propertyKeyValuePair ]

propertyKeyValuePair ::= propertyName ":" valueExpression

Rules
The property key-value expression is treated as a conjunction of equalities on the properties of the element
that the containing pattern matches.

For example, the following node pattern:

({ p: valueExp1, q: valueExp2 })

is equivalent to the following node pattern with a WHERE clause:

(n WHERE n.p = valueExp1 AND n.q = valueExp2)

The value expression can be any expression as listed in the section on expressions, except for path
patterns (which will throw a syntax error) and regular expressions (which will be treated as string literals).
An empty property key-value expression matches all elements. Property key-value expressions can be

combined with a WHERE clause.

Examples
Matches all nodes with property p = 10:

({ p: 10 })

Matches all relationships with property p = 10 and q equal to date 2023-02-10:

()-[{ p: 10, q: date('2023-02-10') }]-()

Matches all relationships with its property p equal to the property p of its source node:

(s)-[{ p: s.p }]-()

Matches all nodes with property p = 10 and property q greater than 100:

(n { p: 10 } WHERE n.q > 100)
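To illustrate the equivalence described above with runnable queries (the Product label and the price and inStock properties are assumptions), the following two queries return the same rows:

MATCH (n:Product {price: 10, inStock: true})
RETURN n

MATCH (n:Product WHERE n.price = 10 AND n.inStock = true)
RETURN n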

Path patterns
A path pattern is the top level pattern that is matched against paths in a graph.

Syntax

pathVariableDeclaration ::= pathVariable "="

pathPatternExpression ::=
{ parenthesizedPathPatternExpression | pathPatternPhrase }

parenthesizedPathPatternExpression ::=
"("
[ subpathVariableDeclaration ]
pathPatternExpression
[ parenthesizedPathPatternWhereClause ]
")"

subpathVariableDeclaration ::= pathVariable "="

pathPatternPhrase ::= [{ simplePathPattern | quantifiedPathPattern }]+

simplePathPattern ::= nodePattern
  [ { relationshipPattern | quantifiedRelationship } nodePattern ]*

parenthesizedPathPatternWhereClause ::= "WHERE" booleanExpression

See the related section for each of the syntax rules:

shortestPathSelector: Shortest paths

quantifiedPathPattern: Quantified path patterns

nodePattern: Node patterns

relationshipPattern: Relationship patterns

quantifiedRelationship: Quantified relationships

Rules
The minimum number of elements in the path pattern must be greater than zero. For example, a path
pattern that is a quantified path pattern with a quantifier that has a lower bound of zero is not allowed:

Not allowed

((n)-[r]->(m)){0,10}

A path pattern must always begin and end with a node pattern.

Not allowed

(n)-[r]->(m)-[s]-

A path pattern may be composed of a concatenation of simple and quantified path patterns. Two simple
path patterns, however, may not be placed next to each other.

Not allowed

(a)<-[s]-(b) (c)-[t]->(d)

When a path pattern is matched to paths in a graph, nodes can be revisited but relationships cannot.

A subpath variable (a path variable declared inside a parenthesized path pattern expression) can only be
used if a shortest path mode is also specified.

Allowed

MATCH SHORTEST 1 (p = (a)-[r]->+(b) WHERE length(p) > 3)

Not allowed

MATCH (:A)-[:S]->(:B) (p = ((a)-[r]->(b))+)

See graph patterns for rules on declaring variables multiple times.

Examples
A single node pattern is allowed as it has at least one element:

(n)

A simple path pattern with more than one element:

(a:A)<-[{p: 30}]-(b)-[t WHERE t.q > 0]->(c:C)

A quantified path pattern can have a lower bound of zero in its quantifier as long as it abuts other patterns
that have at least one element:

(:A) ((:X)-[:R]-()){0,10} (:B)

A quantified relationship can also have a lower bound of zero as long as the overall path pattern has at
least one element:

(:A)-[:R]->{0,10}(:B)

A concatenation of simple and quantified path patterns:

(a)<-[s]-(b)-[t]->(c) ((n)-[r]->(m)){0,10} (:X)

Referencing non-local node variable in a simple path pattern:

(a)<-[s:X WHERE a.p = s.p]-(b)

Referencing a non-local relationship variable within a quantified path pattern:

(:A) ((a)<-[s:X WHERE a.p = s.p]-(b)){,5}

A variable that was introduced in a previous clause can be referenced as long as that variable was defined
outside of a quantified path pattern:

MATCH (n)
MATCH ()-[r WHERE r.q = n.q]-() (()<-[s:X WHERE n.p = s.p]-()){2,3}

Paths matched by the path pattern can be assigned to a variable:

MATCH p = ()-[r WHERE r.q = n.q]-()

Quantified path patterns (introduced in Neo4j 5.9)
A quantified path pattern represents a path pattern repeated a number of times in a given range. It is
composed of a path pattern, representing the path section to be repeated, followed by a quantifier,
constraining the number of repetitions between a lower bound and an upper bound.

For information about an alternative version of patterns for matching paths of variable length, see variable-
length relationships.

Syntax

quantifiedPathPattern ::=
parenthesizedPathPatternExpression quantifier

fixedPath ::= nodePattern [ relationshipPattern nodePattern ]+

Rules

Minimum pattern length

The path pattern being quantified must have a length greater than zero. In other words, it must contain at
least one relationship. A single node pattern cannot be quantified.

Not allowed

((x:A)){2,4}

Nesting of quantified path patterns

The nesting of quantified path patterns is not allowed. For example, the following nesting of a quantified
relationship in a quantified path pattern is not allowed:

Not allowed

(:A) (()-[:R]->+()){2,3} (:B)

A quantified path pattern that is part of the boolean expression within a quantified path pattern would not
count as nested and is permitted.

Allowed

MATCH ((n:A)-[:R]->({p: 30}) WHERE EXISTS { (n)-->+(:X) }){2,3}

Group variables

Variables introduced inside of a quantified path pattern are said to be exposed as group variables outside
of the definition of the pattern. As a group variable, they will be bound to either a list of nodes or a list of
relationships. By contrast, variables can be treated as singletons inside the quantified path pattern where
they are declared. The difference can be seen in the following query:

MATCH ((x)-[r]->(z WHERE z.p > x.p)){2,3}
RETURN [n in x | n.p] AS x_p

In the boolean expression z.p > x.p both z and x are singletons; in the RETURN clause, x is a group variable
that can be iterated over like a list. Note that this means that the WHERE clause z.p > x.p above needs to be
inside the quantified path pattern. The following would throw a syntax error because it is treating z and x
as singletons:

Not allowed

MATCH ((x)-[r]->(z)){2,3} WHERE z.p > x.p

It is possible, however, to position the WHERE clause outside of the node pattern:

Allowed

MATCH ((x)-[r]->(z) WHERE z.p > x.p){2,3}
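Building on the pattern above, the relationship variable r is likewise exposed as a group variable outside the quantified path pattern. A minimal sketch aggregating over it (the weight property is an assumption):

MATCH ((x)-[r]->(z) WHERE z.p > x.p){2,3}
RETURN size(r) AS hops,
       reduce(total = 0, rel IN r | total + rel.weight) AS totalWeight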

Matching

The mechanics of matching a quantified path pattern against paths is best explained with an example. For
the example, the following simple graph will be used:

First, consider the following simple path pattern:

(x:A)-[:R]->(z:B WHERE z.h > 2)

This matches three different paths in the graph above. The resulting bindings for x and z for each match
are the following (the captions n1 etc indicate the identity of the nodes in the diagram above):

x z

n1 n2

n2 n3

n3 n5

If the quantifier {2} is affixed to the simple path pattern, the result is the following quantified path pattern:

((x:A)-[:R]->(z:B WHERE z.h > 2)){2}

This is equivalent to chaining together two iterations of the pattern, where the rightmost node of the first
iteration is merged with the leftmost node of the second one. (See node pattern pairs for more details.)

(x:A)-[:R]->(z:B WHERE z.h > 2) (x:A)-[:R]->(z:B WHERE z.h > 2)

To avoid introducing equijoins between the two instances of x, and between the two instances of z, the
variables are replaced with a set of fresh variables inside each iteration:

(x1:A)-[:R]->(z1:B WHERE z1.h > 2) (x2:A)-[:R]->(z2:B WHERE z2.h > 2)

Then the node variables in adjoining node patterns are merged:

(x1:A)-[:R]->({z1,x2}:A&B WHERE z1.h > 2)-[:R]->(z2:B WHERE z2.h > 2)

The fact that variables x2 and z1 are bound to matches of the same node pattern is represented with the
notation {z1,x2}. Outside of the pattern, the variables x and z will be group variables that contain lists of
nodes.

Consider the quantified path pattern in the following query:

MATCH ((x:A)-[:R]->(z:B WHERE z.h > 2)){2}
RETURN [n in x | n.h] AS x_h, [n in z | n.h] AS z_h

This yields the following results:

x_h z_h

[1, 3] [3, 4]

[3, 4] [4, 5]

Now the quantifier is changed to match lengths from one to five:

MATCH ((x:A)-[:R]->(z:B WHERE z.h > 2)){1,5}
RETURN [n in x | n.h] AS x_h, [n in z | n.h] AS z_h

Compared to the fixed length quantifier {2}, this also matches paths of length one and three, but no
matches exist for length greater than three:

x_h z_h

[1] [3]

[3] [4]

[4] [5]

[1, 3] [3, 4]

[3, 4] [4, 5]

[1, 3, 4] [3, 4, 5]

Quantified relationships (introduced in Neo4j 5.9)

Syntax

quantifiedRelationship ::= relationshipPattern quantifier

Rules
A quantified relationship is an abbreviated form of a quantified path pattern, with only a single relationship
pattern specified.

For example, the following quantified relationship:

(:A)-[r:R]->{m,n}(:B)

This matches paths containing between m and n relationships of type R pointing left-to-right, starting at a
node labeled A, ending at a node labeled B, and matching any nodes in between. It is equivalent to the
following quantified path pattern:

(:A) (()-[r:R]->()){m,n} (:B)

With the expanded form of a quantified path pattern, it is possible to add predicates inside. Conversely, if
the only predicate is a relationship type expression, query syntax can be more concise using a quantified
relationship.

Note that, unlike a quantified path pattern, a quantified relationship must always have a node pattern on
each side.

Examples
Matches one or more relationships with type R and of any direction, and any nodes connecting those
relationships:

()-[:R]-+()

Matches paths consisting of two inbound subpaths, one with relationships of type R and one with
relationships of type S, meeting at a node with label A:

()-[:R]->+(:A)<-[:S]-+()

Matches paths with any nodes and with one or more relationships of any direction and any type:

()--+()

Matches paths starting with nodes labeled A and ending with nodes labeled B, that traverse between two
and three relationships of type R, matching any intermediate nodes:

(:A)-[r:R]->{2,3}(:B)
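Bound to a path variable, a quantified relationship can also be used in a complete query; a minimal sketch over an assumed graph with A and B nodes connected by R relationships:

MATCH p = (:A)-[:R]->{2,3}(:B)
RETURN length(p) AS pathLength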

Quantifiers
The quantifiers here only refer to those used in quantified path patterns and quantified relationships.

Syntax

quantifier ::=
"*" | "+" | fixedQuantifier | generalQuantifier

fixedQuantifier ::= "{" unsignedDecimalInteger "}"

generalQuantifier ::= "{" lowerBound "," upperBound "}"

lowerBound ::= unsignedDecimalInteger

upperBound ::= unsignedDecimalInteger

unsignedDecimalInteger ::= [0-9]+

The unsignedDecimalInteger must not be larger than the Java constant Long.MAX_VALUE (2^63 - 1).

Rules
The absence of an upper bound in the general quantifier syntax means there is no upper bound. The
following table shows variants of the quantifier syntax and their canonical form:

Variant Canonical form Description

{m,n} {m,n} Between m and n iterations.

+ {1,} 1 or more iterations.

* {0,} 0 or more iterations.

{n} {n,n} Exactly n iterations.

{m,} {m,} m or more iterations.

{,n} {0,n} Between 0 and n iterations.

{,} {0,} 0 or more iterations.

Note that a quantified path pattern with the quantifier {1} is not equivalent to a fixed-length path pattern.
Although the resulting quantified path pattern will match on the same paths the fixed-length path
contained in it would without the quantifier, the presence of the quantifier means that all variables within
the path pattern will be exposed as group variables.
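For example, the quantifiers + and {1,} in the table above share the canonical form {1,}, so the following two patterns (the labels and the type R are illustrative) match the same paths:

(:A)-[:R]->+(:B)

(:A)-[:R]->{1,}(:B)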

Variable-length relationships
Prior to the introduction of the syntax for quantified path patterns and quantified relationships in Neo4j
5.9, the only way in Cypher to match paths of variable length was with a variable-length relationship. This
syntax is still available, but it is not GQL conformant. It is equivalent to the syntax for quantified
relationships, with the following differences:

• Position and syntax of quantifier.

• Semantics of the asterisk symbol.

• Type expressions are limited to the disjunction operator.

• The WHERE clause is not allowed.

Syntax

varLengthRelationship ::=
"<-[" varLengthFiller "]-"
| "-[" varLengthFiller "]->"
| "-[" varLengthFiller "]-"

varLengthFiller ::= [ relationshipVariable ] [ varLengthTypeExpression ]
  [ varLengthQuantifier ] [ propertyKeyValueExpression ]

varLengthTypeExpression ::= ":" varLengthTypeTerm

varLengthTypeTerm ::=
typeIdentifier
| varLengthTypeTerm "|" varLengthTypeTerm

varLengthQuantifier ::= varLengthVariable | varLengthFixed

varLengthVariable ::= "*" [ [ lowerBound ] ".." [ upperBound ] ]

varLengthFixed ::= "*" fixedBound

fixedBound ::= unsignedDecimalInteger

lowerBound ::= unsignedDecimalInteger

upperBound ::= unsignedDecimalInteger

unsignedDecimalInteger ::= [0-9]+

The unsignedDecimalInteger must not be larger than the Java constant Long.MAX_VALUE (2^63 - 1).

For rules on valid relationship variable names, see Cypher naming rules.

Rules
The following table shows variants of the variable-length quantifier syntax and their equivalent quantifier
form (the form used by quantified path patterns):

Variant      Description                   Quantified relationship equivalent   Quantified path pattern equivalent
-[*]->       1 or more iterations.         -[]->+                               (()-[]->())+
-[*n]->      Exactly n iterations.         -[]->{n}                             (()-[]->()){n}
-[*m..n]->   Between m and n iterations.   -[]->{m,n}                           (()-[]->()){m,n}
-[*m..]->    m or more iterations.         -[]->{m,}                            (()-[]->()){m,}
-[*0..]->    0 or more iterations.         -[]->*                               (()-[]->())*
-[*..n]->    Between 1 and n iterations.   -[]->{1,n}                           (()-[]->()){1,n}
-[*0..n]->   Between 0 and n iterations.   -[]->{,n}                            (()-[]->()){,n}

Note that * used here on its own is not the same as the Kleene star (an operator that represents zero or
more repetitions), as the Kleene star has a lower bound of zero. The above table can be used to translate
the quantifier used in variable-length relationships. The rules given for quantified path patterns would
apply to the translation.

This table shows some examples:

Variable-length relationship Equivalent quantified path pattern

(a)-[*2]->(b) (a) (()-[]->()){2,2} (b)

(a)-[:KNOWS*3..5]->(b) (a) (()-[:KNOWS]->()){3,5} (b)

(a)-[r*..5 {name: 'Filipa'}]->(b) (a) (()-[r {name: 'Filipa'}]->()){1,5} (b)
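As a sketch of the translation in practice, the following two queries match the same paths (the KNOWS type mirrors the example graph used in the next section):

MATCH p = (a)-[:KNOWS*1..2]->(b)
RETURN p

MATCH p = (a) (()-[:KNOWS]->()){1,2} (b)
RETURN p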

Equijoins
The variable of a variable-length relationship can be used in subsequent patterns to refer to the list of
relationships the variable is bound to. This is the same as the equijoin for variables bound to single nodes
or relationships.

This section uses the following graph:

To recreate the graph, run the following query against an empty Neo4j database:

Query

CREATE ({name: 'Filipa'})-[:KNOWS]->({name:'Anders'})-[:KNOWS]->({name:'Dilshad'})

In the following query, the node variables will be bound to the same nodes:

Query

MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)<-[r*1..2]-(d)
RETURN a = c, b = d, size(r)

Result

a=c b=d size(r)

true true 1

true true 2

Rows: 2

The list of relationships keeps its order. This means that in the following query, where the direction of the
variable-length relationship in the second MATCH is switched, the equijoin will only match once, when there
is a single relationship:

Query

MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)-[r*1..2]->(d)
RETURN a = c, b = d, size(r)

Result

a=c b=d size(r)

false false 1

Rows: 1

The variable r can be reversed in order like any list, and made to match the switch in relationship pattern
direction:

Query

MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
WITH a, b, reverse(r) AS s
MATCH (c)-[s*1..2]->(d)
RETURN a = d, b = c, size(s)

Result

a=d b=c size(s)

true true 1

true true 2

Rows: 2

Changing the bounds on subsequent MATCH statements will mean that only the overlapping lengths of the
quantifier bounds will produce results:

Query

MATCH (a {name: 'Dilshad'})<-[r*1..2]-(b)
MATCH (c)<-[r*2..3]-(d)
RETURN a = c, b = d, size(r)

Result

a=c b=d size(r)

true true 2

Rows: 1

Because Cypher only allows paths to traverse a relationship once (see relationship uniqueness), repeating
a variable-length relationship in the same graph pattern will yield no results. For example, this MATCH clause
will never pass on any intermediate results to subsequent clauses:

MATCH (x)-[r*1..2]->(y)-[r*1..2]->(z)

Attempting to repeat a variable-length relationship in a single relationship pattern will raise an error. For
example, the following pattern raises an error because the variable r appears in both a variable-length
relationship and a fixed-length relationship:

MATCH (x)-[r*1..2]->(y)-[r]->(z)

Examples
The following pattern matches paths starting with nodes labeled A and ending with nodes labeled B, that
traverse between two and three relationships of type R:

(:A)-[r:R*2..3]->(:B)

The following traverses relationships of type R or S or T exactly five times:

()-[r:R|S|T*5]->()

The following traverses any relationship between zero and five times, with the path beginning at nodes
labeled A and ending at nodes labeled B. Note that this will also return all nodes that have both labels A and
B for the case where zero relationships are traversed:

(:A)-[*0..5]->(:B)

If the lower bound is removed, it will default to one, and will no longer return paths of length zero, i.e.
single nodes:

(:A)-[*..5]->(:B)

The following pattern traverses one or more relationships of any direction that have property p = $param:

()-[* {p: $param}]-()

Shortest paths
Path selectors are modes that are specified at the path pattern level. There are five types of path selectors:

• SHORTEST k

• ALL SHORTEST

• SHORTEST k GROUPS

• ANY k

• ALL

The first three are shortest path selectors, and they specify which shortest paths should be returned, with
the exact method of selection depending on the path selector type and the value of k.

See path patterns for details on where in the path pattern to include the path selector.

Syntax

shortestPathSelector ::=
{ allPathSearch | anyPathSearch | shortestPathSearch }

shortestPathSearch ::=
{ allShortest | anyShortest | countedShortestPaths |
countedShortestGroups }

allShortest ::= "ALL SHORTEST" [ pathOrPaths ]

anyShortest ::= "ANY SHORTEST" [ pathOrPaths ]

countedShortestPaths ::=
"SHORTEST" numberOfPaths [ pathOrPaths ]

countedShortestGroups ::=
"SHORTEST" [ numberOfGroups ] [ pathOrPaths ] { "GROUP" | "GROUPS" }

allPathSearch ::= "ALL" [ pathOrPaths ]

anyPathSearch ::= "ANY" [ numberOfPaths ] [ pathOrPaths ]

pathOrPaths ::= { "PATH" | "PATHS" }

numberOfPaths ::= unsignedDecimalInteger

numberOfGroups ::= unsignedDecimalInteger

The unsignedDecimalInteger must not be larger than the Java constant Long.MAX_VALUE (2^63 - 1).

Rules

Selective path selectors

The following path selector types are selective path selectors:

• SHORTEST k

• SHORTEST k GROUPS

• ALL SHORTEST

• ANY k

This means that they reduce the number of matches produced by the path pattern. ALL returns all matches
produced, and so is not selective.

Order of path selection

When a path selector is specified, the following steps are followed to find solutions to the path pattern:

1. All paths matching the path pattern are found.

2. Any paths not satisfying the predicates contained in the path pattern are removed. This is a pre-filter.

3. Paths are selected according to the path selector specified.

4. Any paths not satisfying the predicates in the graph pattern WHERE clause are removed. This is a post-filter.

Examples of pre-filters

// node pattern WHERE clause
MATCH p = SHORTEST 2 (a WHERE a.p < 42)--+()

// relationship pattern property key-value expression
MATCH p = SHORTEST 2 (a)-[{p: 42}]-+()

// label and type expressions
MATCH p = SHORTEST 2 (:A)-[:R]-+(:B)

// quantified path pattern
MATCH SHORTEST 2 ((a)--(b) WHERE a.p < b.p)+

// parenthesized path pattern expression
// note the position of the parentheses!
MATCH SHORTEST 2 ( p = ()--+() WHERE any(n IN nodes(p) WHERE n.p < 42) )

Examples of post-filters

// graph pattern WHERE clause
MATCH p = SHORTEST 2 (a)--+()
WHERE all(n IN nodes(p) WHERE n.p < 42)

Partitions

If there are multiple start and end nodes matching a path pattern, then, when a selective path selector is
specified, the paths matched by the path pattern are partitioned by distinct pairs of start and end nodes.
Within each partition, they are put in ascending order of path length. The partitioned paths are then
selected according to the specified path selector as follows:

Selective path selector Partitioning

SHORTEST k First k paths, starting with the shortest. Where more than one path could be
picked for a given length, there is no particular order in which that selection is
done. If k is greater than the number of paths in the partition, then all of the
paths in that partition are returned.

ANY SHORTEST is equivalent to SHORTEST 1.

SHORTEST k GROUPS The paths are further grouped by path length, and those groups are put in
ascending order by path length. Then the paths from the first k groups are
selected. If k is greater than the number of path length groups within a
partition, then all paths in that partition are returned. GROUP and GROUPS can be
used interchangeably.

ALL SHORTEST All paths tied for the shortest. Equivalent to SHORTEST 1 GROUP.

ANY k Any k paths are returned. ANY is the same as ANY 1. This is useful to determine
the reachability of nodes.

ALL All paths are returned. This is the same as not specifying any path selector.

For a selective path selector, if k is greater than N, the number of paths matching the path pattern, then all
N paths are returned.
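As a hedged sketch of how the partitioning behaves, assuming a graph with A and B nodes connected by R relationships, the following query keeps, for each distinct start/end node pair, every path whose length is among the two shortest lengths, and then summarizes them:

MATCH p = SHORTEST 2 GROUPS (:A)-[:R]-+(:B)
RETURN length(p) AS pathLength, count(p) AS numberOfPaths
ORDER BY pathLength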

Only one path pattern allowed

When a selective shortest path selector is specified for a path pattern, it must be the only path pattern in
the graph pattern.

Not allowed

MATCH p = SHORTEST 2 (:A)--+(a)--+(:B), q = ANY 2 (a)-->{,2}(:C)
RETURN p, q

MATCH ALL SHORTEST (n:A) (()-->(:B))+, (:X)--(n)--(:Y)
RETURN n

Allowed

MATCH p = SHORTEST 2 (:A)--+(a)--+(:B)
MATCH q = ANY 2 (a)-->{,2}(:C)
RETURN p, q

MATCH ALL SHORTEST (n:A) (()-->(:B))+
MATCH (:X)--(n)--(:Y)
RETURN n

MATCH ALL (n:A) (()-->(:B))+, (:X)--(n)--(:Y)
RETURN n

PATH/PATHS

The keywords PATH and PATHS are optional and can be added to the end of the path selector (but before
GROUP or GROUPS). Including either of them does not change path selection. For example, the following two
MATCH clauses are equivalent:

MATCH ALL SHORTEST PATHS (n:A) (()-->(:B))+

MATCH ALL SHORTEST (n:A) (()-->(:B))+

Examples
Return a single shortest path for each distinct pair of nodes matching (:A) and (:B):

MATCH SHORTEST 1 (:A)-[:R]->{0,10}(:B)

Return any two paths connecting each distinct pair of nodes matching (:A) and (:B):

MATCH p = ANY 2 (:A)-[:R]->{0,10}(:B)

Return all paths equal to the shortest path length for each distinct pair of nodes matching (:A) and (:B):

MATCH ALL SHORTEST (:A)-[:R]->{0,10}(:B)

Return all paths equal to the two shortest path lengths for each distinct pair of nodes matching (:A) and
(:B):

MATCH SHORTEST 2 GROUPS (:A)-[:R]->{0,10}(:B)

Return a single shortest path for each distinct pair of nodes, where the path length is an even number:

MATCH SHORTEST 1 (p = ()--+() WHERE length(p) % 2 = 0)

For every single shortest path connecting each distinct pair of nodes, only return those whose path length
is an even number (so fewer results than the previous example):

MATCH p = SHORTEST 1 ()--+()
WHERE length(p) % 2 = 0

The shortestPath() and allShortestPaths() functions


Prior to the introduction of keyword-based specification of shortest path selection in Neo4j 5.21, the only
available syntax for shortest paths was the functions shortestPath() and allShortestPaths(). They are
similar to SHORTEST 1 and ALL SHORTEST, but with several differences:

• The path pattern is passed as an argument to the functions.

• The path pattern is limited to a single relationship pattern.

• To return results where the first and last node in the path are the same requires a change to the
configuration setting dbms.cypher.forbid_shortestpath_common_nodes.

Both functions will continue to be available, but they are not GQL conformant.

Syntax

pathSelectorFunction ::=
{ shortestPathFunction | allShortestPathsFunction }

shortestPathFunction ::=
"shortestPath(" + oneRelPathPatternExpression + ")"

allShortestPathsFunction ::=
"allShortestPaths(" + oneRelPathPatternExpression + ")"

oneRelPathPatternExpression ::=
nodePattern varLengthRelationship nodePattern

Note that it is possible to pass a fixed length path pattern (with a single relationship) to the path selector
function, but doing so would not serve any purpose in discovering a shortest path.

Rules

Restricted to variable length

The pattern in the path selector function must be a variable-length relationship and not a quantified path
pattern.

Not allowed

shortestPath(((a)-[:R]-(b)){1,5})

shortestPath((:A)-->+(:B))

Path pattern length

There must be exactly one relationship pattern in the path pattern.

Allowed

shortestPath((a)-[:R*1..5]-(b))

Not allowed

shortestPath((a)-[:R*1..5]-(b)-->(:X))

shortestPath((:A))

allShortestPaths((a:A)-[:S*]->(:B), (a)-[:R*1..3]->(:C))

Pre and post filtering

If the MATCH clause containing the shortestPath() function includes a WHERE clause, this condition acts as a
pre-filter: paths satisfying the WHERE clause are first found, and from those paths the shortest path is selected.

Examples of pre-filters:

MATCH p = shortestPath(()-[*]-())
WHERE all(n in nodes(p) WHERE n.p < 42)

MATCH p = shortestPath(()-[*]-())
WHERE all(r in relationships(p) WHERE r.p < 42)

These contrast with WHERE clauses that are not part of the same MATCH clause.

Example of post-filters

MATCH p = shortestPath(()-[*]-())
WITH nodes(p) AS N
WHERE all(n in N WHERE n.p < 42)

Non-determinism

If there is more than one path with the minimum length, the shortestPath() function will return one of those
paths non-deterministically. Since allShortestPaths() returns all of those paths, its results are
deterministic.

Examples
Return a single shortest path for each distinct pair of nodes matching (:A) and (:B):

MATCH shortestPath((:A)-[:R*0..10]->(:B))

Return all paths equal to the shortest path length for each distinct pair of nodes matching (:A) and (:B):

MATCH allShortestPaths((:A)-[:R*0..10]->(:B))

Graph patterns
A graph pattern is a comma separated list of one or more path patterns. It is the top level construct
provided to MATCH.

Syntax

graphPattern ::=
pathPattern [ "," pathPattern ]* [ graphPatternWhereClause ]

graphPatternWhereClause ::= "WHERE" booleanExpression

Rules
The rules for path patterns apply to each constituent path pattern of a graph pattern.

Variable references

If a variable is declared inside a quantified path pattern, then it can be treated as a singleton only from
within the quantified path pattern it was declared in. Outside of that quantified path pattern, it must be
treated as a group variable.

Allowed

((n)-[r]->(m WHERE r.p = m.q))+

Allowed

(n)-[r]->+(m WHERE all(rel in r WHERE rel.q > m.q))

Not allowed

(n)-[r]->+(m WHERE r.p = m.q)

Relationship uniqueness

A relationship can only be traversed once in a given match for a graph pattern. The same restriction
doesn’t hold for nodes, which may be re-traversed any number of times in a match.

Equijoin

If a node variable is declared more than once in a path pattern, it is expressing an equijoin. This is an
operation that requires that each node pattern with the same node variable be bound to the same node.
For example, the following pattern refers to the same node twice with the variable a, forming a cycle:

(a)-->(b)-->(c)-->(a)

The following pattern refers to the same node with variable b in different path patterns of the same graph
pattern, forming a "T" shaped pattern:

(a)-->(b)-->(c), (b)-->(e)

Equijoins can only be made using variables outside of quantified path patterns. The following would not be
a valid equijoin:

Not allowed

(a)-->(b)-->(c), ((b)-->(e))+ (:X)

If no equijoin exists between path patterns in a graph pattern, then a Cartesian join is formed from the sets
of matches for each path pattern. An equijoin can be expressed between relationship patterns by declaring
a relationship variable multiple times. However, as relationships can only be traversed once in a given
match, no solutions would be returned.

Examples
The WHERE clause can refer to variables inside and outside of quantified path patterns:

(a)-->(b)-->(c), (b) ((d)-->(e))+ WHERE any(n in d WHERE n.p = a.p)

An equijoin can be formed to match "H" shaped graphs:

(:A)-->(x)--(:B), (x)-[:R]->+(y), (:C)-->(y)-->(:D)

With no variables in common, this graph pattern will result in a Cartesian join between the sets of matches
for the two path patterns:

(a)-->(b)-->(c), ((d)-->(e))+

Multiple equijoins can be formed between path patterns:

(:X)-->(a:A)-[:!R]->+(b:B)-->(:Y), (a)-[:R]->+(b)

Variables declared in a previous MATCH can be referenced inside of a quantified path pattern:

MATCH (n {p: 'ABC'})
MATCH (n)-->(m:A)-->(:B), (m) (()-[r WHERE r.p <> n.p]->())+ (:C)

The repetition of a relationship variable in the following yields no solutions due to Cypher enforcing
relationship uniqueness within a match for a graph pattern:

MATCH ()-[r]->()-->(), ()-[r]-()

Node pattern pairs


It is not valid syntax to write a pair of node patterns next to each other. For example, all of the following
would raise a syntax error:

(a:A)(b:B)

(a:A)(b:B)<-[r:R]-(c:C)

(a:A)<--(b:B)(c:C)-->(d:C)

However, the placing of pairs of node patterns next to each other is valid where it results indirectly from
the expansion of quantified path patterns.

Iterations of quantified path patterns


When a quantified path pattern is expanded, the fixed path pattern contained in its parentheses is
repeated and chained. This results in pairs of node patterns sitting next to each other. Take the following
quantified path pattern as an example:

((x:X)<--(y:Y)){3}

This is expanded by repeating the fixed path pattern (x:X)<--(y:Y) three times, with indices on the
variables to show that no equijoin is implied (see equijoins for more information):

(x1:X)<--(y1:Y)(x2:X)<--(y2:Y)(x3:X)<--(y3:Y)

The result is that two pairs of node patterns end up adjoining each other, (y1:Y)(x2:X) and (y2:Y)(x3:X).
During the matching process, each pair of node patterns will match the same nodes, and those nodes will
satisfy the conjunction of the predicates in the node patterns. For example, in the first pair both y1 and x2
will bind to the same node, and that node must have labels X and Y. This expansion and binding is depicted
in the following diagram:

Simple path patterns and quantified path patterns
Pairs of node patterns are also generated when a simple path pattern is placed next to a quantified path
pattern. For example, consider the following path pattern:

(:A)-[:R]->(:B) ((:X)<--(:Y)){1,2}

After expanding the iterations of the quantified path pattern, the right-hand node pattern (:B) adjoins the
left-hand node pattern (:X). The result will match the same paths as the union of matches of the following
two path patterns:

(:A)-[:R]->(:B&X)<--(:Y)

(:A)-[:R]->(:B&X)<--(:Y&X)<--(:Y)

If the simple path pattern is on the right of the quantified path pattern, its leftmost node (:A) adjoins the
rightmost node (:Y) of the last iteration of the quantified path pattern. For example, the following:

((:X)<--(:Y)){1,2} (:A)-[:R]->(:B)

will match the same paths as the union of the following two path patterns:

(:X)<--(:Y&A)-[:R]->(:B)

(:X)<--(:Y&X)<--(:Y&A)-[:R]->(:B)

Pairs of quantified path patterns
When two quantified path patterns adjoin, the rightmost node of the last iteration of the first pattern is
merged with the leftmost node of the first iteration of the second pattern. For example, the following
adjoining patterns:

((:A)-[:R]->(:B)){2} ((:X)<--(:Y)){1,2}

will match the same set of paths as the union of the paths matched by these two path patterns:

(:A)-[:R]->(:B&A)-[:R]->(:B&X)<--(:Y)

(:A)-[:R]->(:B&A)-[:R]->(:B&X)<--(:Y&X)<--(:Y)

Zero iterations
If the quantifier allows for zero iterations of a pattern, for example {0,3}, then the 0th iteration of that
pattern results in the node patterns on either side pairing up.

For example, the following path pattern:

(:X) ((a:A)-[:R]->(b:B)){0,1} (:Y)

will match the same set of paths as the union of the paths matched by the following two path patterns:

(:X&Y)

(:X&A)-[:R]->(:B&Y)

Values and types
Cypher supports a range of data values. When writing Cypher queries, it is not possible to declare a data
type. Rather, Cypher will automatically infer the data type of a given value.

More information about the data values and types supported by Cypher can be found in the following
sections:

• Property, structural, and constructed values

• Temporal values

• Spatial values

• Working with null

• Lists

• Maps

• Casting data values

• Type predicate expressions

Property, structural, and constructed values


Cypher provides first class support for a number of data value types. These fall into the following three
categories: property, structural, and constructed. This section will first provide a brief overview of each
type, and then go into more detail about the property data type.

Property types
A property type value is one that can be stored as a node or relationship property.

The following data types are included in the property types category: BOOLEAN, DATE, DURATION, FLOAT,
INTEGER, LIST, LOCAL DATETIME, LOCAL TIME, POINT, STRING, ZONED DATETIME, and ZONED TIME.

• Property types can be returned from Cypher queries.

• Property types can be used as parameters.

• Property types can be stored as properties.

• Property types can be constructed with Cypher literals.

Homogeneous lists of simple types can be stored as properties, although lists in general (see Constructed
types) cannot be stored as properties. Lists stored as properties cannot contain null values.

Cypher also provides pass-through support for byte arrays, which can be stored as property values. Byte
arrays are supported for performance reasons, since using Cypher’s generic data type, LIST<INTEGER>
(where each INTEGER has a 64-bit representation), would be too costly. However, byte arrays are not
considered a first class data type by Cypher, so they do not have a literal representation.
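For example (the Measurement label and the samples property are illustrative), a homogeneous list of INTEGER values can be stored as a property, whereas a mixed list such as [1, 'two', 3] could not be stored:

// Allowed: a homogeneous LIST<INTEGER> stored as a property
CREATE (m:Measurement {samples: [1, 2, 3]})
RETURN m.samples AS samples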

Structural types
The following data types are included in the structural types category: NODE, RELATIONSHIP, and PATH.

• Structural types can be returned from Cypher queries.

• Structural types cannot be used as parameters.

• Structural types cannot be stored as properties.

• Structural types cannot be constructed with Cypher literals.

The NODE data type includes: id, label(s), and a map of properties. Note that labels are not values, but a
form of pattern syntax.

The RELATIONSHIP data type includes: id, relationship type, a map of properties, start node id, and end node
id.

The PATH data type is an alternating sequence of nodes and relationships.

Nodes, relationships, and paths are returned as a result of pattern matching.

Note that in Neo4j, all relationships have a direction. However, you can have the notion of undirected
relationships at query time.

Constructed types
The following data types are included in the constructed types category: LIST and MAP.

• Constructed types can be returned from Cypher queries.

• Constructed types can be used as parameters.

• Constructed types cannot be stored as properties (with the exception of homogeneous lists).

• Constructed types can be constructed with Cypher literals.

The LIST data type can be either a homogeneous collection of simple values, or a heterogeneous, ordered
collection of values, each of which can have any property, structural or constructed type.

The MAP data type is a heterogeneous, unordered collection of (Key, Value) pairs, where Key is a literal and
Value can have any property, structural, or constructed type.

Constructed type values can also contain null. For more details, see working with null.
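As a brief illustration, LIST and MAP values can be constructed with literals and returned directly (the values are arbitrary):

RETURN [1, 'two', 3.0, null] AS aList,
       {name: 'Anders', scores: [1, 2, 3]} AS aMap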

Types and their synonyms


The table below shows the types and their syntactic synonyms.

These types (and their synonyms) can be used in type predicate expressions and in property type
constraints. They are also returned as a STRING value when using the valueType() function.

However, not all types can be used in all places.

Type                            Synonyms
ANY                             ANY VALUE
BOOLEAN                         BOOL
DATE
DURATION
FLOAT
INTEGER                         INT, SIGNED INTEGER
LIST<INNER_TYPE>                ARRAY<INNER_TYPE>, INNER_TYPE LIST, INNER_TYPE ARRAY
LOCAL DATETIME                  TIMESTAMP WITHOUT TIME ZONE, TIMESTAMP WITHOUT TIMEZONE
LOCAL TIME                      TIME WITHOUT TIME ZONE, TIME WITHOUT TIMEZONE
MAP
NODE                            ANY NODE, VERTEX, ANY VERTEX
NOTHING
NULL
PATH
POINT
PROPERTY VALUE                  ANY PROPERTY VALUE
RELATIONSHIP                    ANY RELATIONSHIP, EDGE, ANY EDGE
STRING                          VARCHAR
ZONED DATETIME                  TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH TIMEZONE
ZONED TIME                      TIME WITH TIME ZONE, TIME WITH TIMEZONE
INNER_TYPE_1 | INNER_TYPE_2…    ANY<INNER_TYPE_1 | INNER_TYPE_2…>

All Cypher types contain the null value. To make them not nullable, NOT NULL can be appended to the end
of the type (e.g. BOOLEAN NOT NULL, LIST<FLOAT NOT NULL>). A shorthand syntax equivalent, introduced in
Neo4j 5.14, for NOT NULL is to use an exclamation mark ! (e.g. INTEGER!, LIST<STRING!>). Note that closed
dynamic types (INNER_TYPE_1 | INNER_TYPE_2…) cannot be appended with NOT NULL: all inner types must
be nullable, or all appended with NOT NULL.
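As an illustration (a small sketch using a type predicate expression, described in a later section), a type and one of its synonyms behave identically:

RETURN 42 IS :: INTEGER AS isInteger,
       42 IS :: INT AS isInt

Both columns are expected to return true.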

Type Normalization
Cypher runs a normalization algorithm on all input types, simplifying the given type to a deterministic
representation for equivalent types. Types are simplified to their default name (e.g. BOOL is simplified to
BOOLEAN). Encompassing types are absorbed (e.g. LIST<BOOLEAN> | LIST<BOOLEAN | INTEGER> is
normalized to LIST<BOOLEAN | INTEGER>). Types are also ordered.

The type PROPERTY VALUE is expanded to a closed dynamic union of all valid property types, and if all types
are represented, then the normalization would simplify to ANY.

For example, given the closed dynamic type BOOL | LIST<INT> | BOOLEAN | LIST<FLOAT | INT>, the
normalized type would be: BOOLEAN | LIST<INTEGER | FLOAT>.

This normalization is run on types used in type predicate expressions, and in property type constraints.
Type normalization is also used to ensure the consistency of the output for the valueType() function.
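A small sketch of observing normalized type names through valueType(); the output strings shown in the comments are expectations based on the rules above, not verified output:

RETURN valueType(true) AS boolType,      // expected: 'BOOLEAN NOT NULL'
       valueType([1, 0.5]) AS listType   // expected: 'LIST<INTEGER NOT NULL | FLOAT NOT NULL> NOT NULL'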

Ordering of types
The ordering of types is as follows:

• Predefined types
◦ NOTHING

◦ NULL

◦ BOOLEAN

◦ STRING

◦ INTEGER

◦ FLOAT

◦ DATE

◦ LOCAL TIME

◦ ZONED TIME

◦ LOCAL DATETIME

◦ ZONED DATETIME

◦ DURATION

◦ POINT

◦ NODE

◦ RELATIONSHIP

• Constructed types
◦ MAP

◦ LIST<INNER_TYPE> (ordered by the inner type)

◦ PATH

• Dynamic union types

◦ INNER_TYPE_1 | INNER_TYPE_2… (ordered by specific rules for closed dynamic union type)

◦ ANY

Subtypes are always ordered before any enclosing types (e.g. LIST<INTEGER> is ordered before
LIST<INTEGER | FLOAT>). This also means that the NOT NULL variants of each type comes before the
nullable variant.

The order between two closed dynamic unions A and B is determined as follows:

• If A has fewer inner types than B, A is ordered first.

• If A and B have the same number of inner types, they are ordered according to the order of the first
inner type that differs (lexicographic order).

The resulting order is deterministic.

Property type details


The table below provides more detailed information about the various property types that Cypher
supports. Note that Cypher types are implemented using Java, and that the table below references Java
value constants.

Type             Min. value                                    Max. value                                     Precision

BOOLEAN          False                                         True                                           -

DATE             -999_999_999-01-01                            +999_999_999-12-31                             Days

DURATION         P-292471208677Y-6M-15DT-15H-36M-32S           P292471208677Y6M15DT15H36M32.999999999S        Nanoseconds

FLOAT [1]        Double.MIN_VALUE                              Double.MAX_VALUE                               64 bit

INTEGER          Long.MIN_VALUE                                Long.MAX_VALUE                                 64 bit

LOCAL DATETIME   -999_999_999-01-01T00:00:00                   +999_999_999-12-31T23:59:59.999999999          Nanoseconds

LOCAL TIME       00:00:00                                      23:59:59.999999999                             Nanoseconds

POINT            Min.: Cartesian (-Double.MAX_VALUE, -Double.MAX_VALUE);
                 Cartesian_3D (-Double.MAX_VALUE, -Double.MAX_VALUE, -Double.MAX_VALUE);
                 WGS_84 (-180, -90); WGS_84_3D (-180, -90, -Double.MAX_VALUE)
                 Max.: Cartesian (Double.MAX_VALUE, Double.MAX_VALUE);
                 Cartesian_3D (Double.MAX_VALUE, Double.MAX_VALUE, Double.MAX_VALUE);
                 WGS_84 (180, 90); WGS_84_3D (180, 90, Double.MAX_VALUE)
                 Precision: each coordinate of the POINT is 64 bit, as they are floats.

STRING           -                                             -                                              -

ZONED DATETIME   -999_999_999-01-01T00:00:00+18:00             +999_999_999-12-31T23:59:59.999999999-18:00    Nanoseconds

ZONED TIME       00:00:00+18:00                                23:59:59.999999999-18:00                       Nanoseconds

Java value details

Name Value

Double.MAX_VALUE 1.7976931348623157e+308

Double.MIN_VALUE 4.9e-324

Long.MAX_VALUE 2^63-1

Long.MIN_VALUE -2^63

Temporal values
Cypher has built-in support for handling temporal values, which can be stored as properties on nodes and
relationships in Neo4j databases. This section will discuss how Cypher handles time zones, before
exploring temporal values in more detail.

• Refer to Temporal functions - instant types for information regarding temporal functions allowing for
the creation and manipulation of temporal values.

• Refer to Temporal operators for information regarding temporal operators.

• Refer to Ordering and comparison of values for information regarding the comparison and ordering of
temporal values.

Temporal value types


The following table lists the temporal value types and their supported components:

Type             Date support   Time support   Time zone support

DATE             yes            -              -

LOCAL TIME       -              yes            -

ZONED TIME       -              yes            yes

LOCAL DATETIME   yes            yes            -

ZONED DATETIME   yes            yes            yes

DURATION         -              -              -

DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, and ZONED DATETIME are temporal instant types. A temporal
instant value expresses a point in time with varying degrees of precision.

By contrast, DURATION is not a temporal instant type. A DURATION represents a temporal amount, capturing
the difference in time between two instants, and can be negative. DURATION captures the amount of time
between two instants; it does not capture a start time and an end time.

Starting from Neo4j 5.9, some temporal types have been renamed. The table below shows the current as
well as the old names of the temporal types.

Type Old type name

DATE Date

LOCAL TIME LocalTime

ZONED TIME Time

LOCAL DATETIME LocalDateTime

ZONED DATETIME DateTime

DURATION Duration

Time zones
Time zones are represented either as an offset from UTC, or as a logical identifier of a named time zone
(these are based on the IANA time zone database). In either case, the time is stored as UTC internally, and
the time zone offset is only applied when the time is presented. This means that temporal instants can be
ordered without taking time zone into account. If, however, two times are identical in UTC, then they are
ordered by timezone.

When creating a time using a named time zone, the offset from UTC is computed from the rules in the time
zone database to create a time instant in UTC, and to ensure the named time zone is a valid one.

It is possible for time zone rules to change in the IANA time zone database. For example, there could be
alterations to the rules for daylight savings time in a certain area. If this occurs after the creation of a
temporal instant, the presented time could differ from the originally-entered time, insofar as the local
timezone is concerned. However, the absolute time in UTC would remain the same.

There are three ways of specifying a time zone in Cypher:

• Specifying the offset from UTC in hours and minutes (ISO 8601).

• Specifying a named time zone.

• Specifying both the offset and the time zone name (with the requirement that these match).

See specifying time zones for examples.

The named time zone form uses the rules of the IANA time zone database to manage daylight savings
time (DST).

The default time zone of the database can be configured using the configuration option
db.temporal.timezone. This configuration option influences the creation of temporal types for the
following functions:

• Getting the current date and time without specifying a time zone.

• Creating a temporal type from its components without specifying a time zone.

• Creating a temporal type by parsing a STRING without specifying a time zone.

• Creating a temporal type by combining or selecting values that do not have a time zone component,
and without specifying a time zone.

• Truncating a temporal value that does not have a time zone component, and without specifying a time
zone.

Temporal instants

Specifying temporal instants


A temporal instant consists of three parts: the date, the time, and the time zone. These parts can be
combined to produce the various temporal value types. The character T is a literal character.

Temporal instant type Composition of parts

DATE <date>

LOCAL TIME <time> or T<time>

ZONED TIME <time><timezone> or T<time><timezone>

LOCAL DATETIME* <date>T<time>

ZONED DATETIME* <date>T<time><timezone>

*When date and time are combined, date must be complete; i.e. fully identify a particular day.

Specifying dates

Component                 Format   Description

Year                      YYYY     Specified with at least four digits (special rules apply in certain cases).

Month                     MM       Specified with a double digit number from 01 to 12.

Week                      ww       Always prefixed with W and specified with a double digit number from 01 to 53.

Quarter                   q        Always prefixed with Q and specified with a single digit number from 1 to 4.

Day of the month          DD       Specified with a double digit number from 01 to 31.

Day of the week           D        Specified with a single digit number from 1 to 7.

Day of the quarter        DD       Specified with a double digit number from 01 to 92.

Ordinal day of the year   DDD      Specified with a triple digit number from 001 to 366.

If the year is before 0000 or after 9999, the following additional rules apply:

• Minus sign, - must prefix any year before 0000, (e.g. -3000-01-01).

• Plus sign, + must prefix any year after 9999, (e.g. +11000-01-01).

• The year must be separated with - from the next component:


◦ if the next component is month, (e.g. +11000-01).

◦ if the next component is day of the year, (e.g. +11000-123).

If the year component is prefixed with either - or +, and is separated from the next component, Year is
allowed to contain up to nine digits. Thus, the allowed range of years is between -999,999,999 and
+999,999,999. For all other cases, i.e. the year is between 0000 and 9999 (inclusive), Year must have
exactly four digits (the year component is interpreted as a year of the Common Era (CE)).

The following formats are supported for specifying dates:

Format       Description                      Example      Interpretation of example
YYYY-MM-DD   Calendar date: Year-Month-Day    2015-07-21   2015-07-21
YYYYMMDD     Calendar date: Year-Month-Day    20150721     2015-07-21
YYYY-MM      Calendar date: Year-Month        2015-07      2015-07-01
YYYYMM       Calendar date: Year-Month        201507       2015-07-01
YYYY-Www-D   Week date: Year-Week-Day         2015-W30-2   2015-07-21
YYYYWwwD     Week date: Year-Week-Day         2015W302     2015-07-21
YYYY-Www     Week date: Year-Week             2015-W30     2015-07-20
YYYYWww      Week date: Year-Week             2015W30      2015-07-20
YYYY-Qq-DD   Quarter date: Year-Quarter-Day   2015-Q2-60   2015-05-30
YYYYQqDD     Quarter date: Year-Quarter-Day   2015Q260     2015-05-30
YYYY-Qq      Quarter date: Year-Quarter       2015-Q2      2015-04-01
YYYYQq       Quarter date: Year-Quarter       2015Q2       2015-04-01
YYYY-DDD     Ordinal date: Year-Day           2015-202     2015-07-21
YYYYDDD      Ordinal date: Year-Day           2015202      2015-07-21
YYYY         Year                             2015         2015-01-01

The smallest components can be omitted. Cypher will assume omitted components to have their lowest
possible value. For example, 2013-06 will be interpreted as being the same date as 2013-06-01.

Specifying times

Component   Format      Description

Hour        HH          Specified with a double digit number from 00 to 23.

Minute      MM          Specified with a double digit number from 00 to 59.

Second      SS          Specified with a double digit number from 00 to 59.

fraction    sssssssss   Specified with a number from 0 to 999999999. It is not required to specify leading
                        zeros. fraction is an optional, sub-second component of Second. This can be separated
                        from Second using either a full stop (.) or a comma (,). The fraction is in addition
                        to the two digits of Second.

Cypher does not support leap seconds; UTC-SLS (UTC with Smoothed Leap Seconds) is used to manage
the difference in time between UTC and TAI (International Atomic Time).

The following formats are supported for specifying times:

Format               Description                   Example        Interpretation of example
HH:MM:SS.sssssssss   Hour:Minute:Second.fraction   21:40:32.142   21:40:32.142
HHMMSS.sssssssss     Hour:Minute:Second.fraction   214032.142     21:40:32.142
HH:MM:SS             Hour:Minute:Second            21:40:32       21:40:32.000
HHMMSS               Hour:Minute:Second            214032         21:40:32.000
HH:MM                Hour:Minute                   21:40          21:40:00.000
HHMM                 Hour:Minute                   2140           21:40:00.000
HH                   Hour                          21             21:00:00.000

The smallest components can be omitted. For example, a time may be specified with Hour and Minute,
leaving out Second and fraction. On the other hand, specifying a time with Hour and Second, while leaving
out Minute, is not possible.

Specifying time zones

The time zone is specified in one of the following ways:

• As an offset from UTC.

• Using the Z shorthand for the UTC (±00:00) time zone.

When specifying a time zone as an offset from UTC, the rules below apply:

• The time zone always starts with either a plus (+) or minus (-) sign.
◦ Positive offsets, i.e. time zones beginning with +, denote time zones east of UTC.

◦ Negative offsets, i.e. time zones beginning with -, denote time zones west of UTC.

• A double-digit hour offset follows the +/- sign.

• An optional double-digit minute offset follows the hour offset, optionally separated by a colon (:).

• The time zone of the International Date Line is denoted either by +12:00 or -12:00, depending on
country.

When creating values of the ZONED DATETIME temporal instant type, the time zone may also be specified
using a named time zone, using the names from the IANA time zone database. This may be provided
either in addition to, or in place of the offset. The named time zone is given last and is enclosed in square
brackets ([]). Should both the offset and the named time zone be provided, the offset must match the
named time zone.

The following formats are supported for specifying time zones:

Format             Description             Example                      Supported for ZONED DATETIME   Supported for ZONED TIME
Z                  UTC                     Z                            yes                            yes
±HH:MM             Hour:Minute             +09:30                       yes                            yes
±HH:MM[ZoneName]   Hour:Minute[ZoneName]   +08:45[Australia/Eucla]      yes                            no
±HHMM              Hour:Minute             +0100                        yes                            yes
±HHMM[ZoneName]    Hour:Minute[ZoneName]   +0200[Africa/Johannesburg]   yes                            no
±HH                Hour                    -08                          yes                            yes
±HH[ZoneName]      Hour[ZoneName]          +08[Asia/Singapore]          yes                            no
[ZoneName]         [ZoneName]              [America/Regina]             yes                            no
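To illustrate the three ways of specifying a time zone listed above (the instant is chosen arbitrarily; in June, Europe/Stockholm observes the +02:00 offset, so the combined form is consistent):

RETURN datetime('2024-06-01T12:00:00+02:00') AS offsetOnly,
       datetime('2024-06-01T12:00:00[Europe/Stockholm]') AS namedZone,
       datetime('2024-06-01T12:00:00+02:00[Europe/Stockholm]') AS offsetAndName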

Components of temporal instants


Components of temporal instant values can be accessed as properties.

Components of temporal instant values and where they are supported

Each component is listed with its type and range/format, its description, and the temporal types that support it.

instant.year (INTEGER, at least 4 digits; see the rules for using the Year component): The year component represents the astronomical year number of the instant. [2] Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.quarter (INTEGER, 1 to 4): The quarter-of-the-year component. Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.month (INTEGER, 1 to 12): The month-of-the-year component. Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.week (INTEGER, 1 to 53): The week-of-the-year component. [3] Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.weekYear (INTEGER, at least 4 digits; see the rules for using the Year component): The year that the week-of-year component belongs to. [4] Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.dayOfQuarter (INTEGER, 1 to 92): The day-of-the-quarter component. Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.quarterDay (INTEGER, 1 to 92): The day-of-the-quarter component (alias for instant.dayOfQuarter). Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.day (INTEGER, 1 to 31): The day-of-the-month component. Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.ordinalDay (INTEGER, 1 to 366): The day-of-the-year component. Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.dayOfWeek (INTEGER, 1 to 7): The day-of-the-week component (the first day of the week is Monday). Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.weekDay (INTEGER, 1 to 7): The day-of-the-week component (alias for instant.dayOfWeek). Supported for: DATE, ZONED DATETIME, LOCAL DATETIME.

instant.hour (INTEGER, 0 to 23): The hour component. Supported for: ZONED DATETIME, LOCAL DATETIME, ZONED TIME, LOCAL TIME.

instant.minute (INTEGER, 0 to 59): The minute component. Supported for: ZONED DATETIME, LOCAL DATETIME, ZONED TIME, LOCAL TIME.

instant.second (INTEGER, 0 to 59): The second component. [5] Supported for: ZONED DATETIME, LOCAL DATETIME, ZONED TIME, LOCAL TIME.

instant.millisecond (INTEGER, 0 to 999): The millisecond component. Supported for: ZONED DATETIME, LOCAL DATETIME, ZONED TIME, LOCAL TIME.

instant.microsecond (INTEGER, 0 to 999999): The microsecond component. Supported for: ZONED DATETIME, LOCAL DATETIME, ZONED TIME, LOCAL TIME.

instant.nanosecond (INTEGER, 0 to 999999999): The nanosecond component. Supported for: ZONED DATETIME, LOCAL DATETIME, ZONED TIME, LOCAL TIME.

instant.timezone (STRING): The timezone component. Depending on how the time zone was specified, this is either a time zone name or an offset from UTC in the format ±HHMM. Supported for: ZONED DATETIME, ZONED TIME.

instant.offset (STRING, in the format ±HHMM): The timezone offset. Supported for: ZONED DATETIME, ZONED TIME.

instant.offsetMinutes (INTEGER, -1080 to +1080): The timezone offset in minutes. Supported for: ZONED DATETIME, ZONED TIME.

instant.offsetSeconds (INTEGER, -64800 to +64800): The timezone offset in seconds. Supported for: ZONED DATETIME, ZONED TIME.

instant.epochMillis (INTEGER, positive for instants after and negative for instants before 1970-01-01T00:00:00+0000): The number of milliseconds between 1970-01-01T00:00:00+0000 and the instant. [6] Supported for: ZONED DATETIME.

instant.epochSeconds (INTEGER, positive for instants after and negative for instants before 1970-01-01T00:00:00+0000): The number of seconds between 1970-01-01T00:00:00+0000 and the instant. [7] Supported for: ZONED DATETIME.

Examples
Below are examples of parsing instant values using various temporal functions. More information about
these temporal functions can be found here.

Example 56. datetime

Parsing a ZONED DATETIME using the calendar date format:

Query

RETURN datetime('2015-06-24T12:50:35.556+0100') AS theDateTime

Result

theDateTime

2015-06-24T12:50:35.556+01:00

Rows: 1

Example 57. localdatetime

Parsing a LOCAL DATETIME using the ordinal date format:

Query

RETURN localdatetime('2015185T19:32:24') AS theLocalDateTime

Result

theLocalDateTime

2015-07-04T19:32:24

Rows: 1

Example 58. date

Parsing a DATE using the week date format:

Query

RETURN date('+2015-W13-4') AS theDate

Result

theDate

2015-03-26

Rows: 1

Example 59. time

Parsing a ZONED TIME:

Query

RETURN time('125035.556+0100') AS theTime

Result

theTime

12:50:35.556000000+01:00

Rows: 1

Example 60. localtime

Parsing a LOCAL TIME:

Query

RETURN localtime('12:50:35.556') AS theLocalTime

Result

theLocalTime

12:50:35.556000000

Rows: 1

Example 61. date

The following query shows how to get the components of a DATE value:

Query

WITH date({year: 1984, month: 10, day: 11}) AS d


RETURN d.year, d.quarter, d.month, d.week, d.weekYear, d.day, d.ordinalDay, d.dayOfWeek,
d.dayOfQuarter

Result

d.year | d.quarter | d.month | d.week | d.weekYear | d.day | d.ordinalDay | d.dayOfWeek | d.dayOfQuarter

1984 | 4 | 10 | 41 | 1984 | 11 | 285 | 4 | 11

Rows: 1

Example 62. datetime

The following query shows how to get the date-related components of a ZONED DATETIME value:

Query

WITH datetime({
year: 1984, month: 11, day: 11,
hour: 12, minute: 31, second: 14, nanosecond: 645876123,
timezone: 'Europe/Stockholm'
}) AS d
RETURN d.year, d.quarter, d.month, d.week, d.weekYear, d.day, d.ordinalDay, d.dayOfWeek,
d.dayOfQuarter

Result

d.year | d.quarter | d.month | d.week | d.weekYear | d.day | d.ordinalDay | d.dayOfWeek | d.dayOfQuarter

1984 | 4 | 11 | 45 | 1984 | 11 | 316 | 7 | 42

Rows: 1

Example 63. datetime

The following query shows how to get the time-related components of a ZONED DATETIME value:

Query

WITH datetime({
year: 1984, month: 11, day: 11,
hour: 12, minute: 31, second: 14, nanosecond: 645876123,
timezone: 'Europe/Stockholm'
}) AS d
RETURN d.hour, d.minute, d.second, d.millisecond, d.microsecond, d.nanosecond

Result

d.hour d.minute d.second d.millisecond d.microsecond d.nanosecond

12 31 14 645 645876 645876123

Rows: 1

Example 64. datetime

The following query shows how to get the epoch time and timezone-related components of a ZONED
DATETIME value:

Query

WITH datetime({
year: 1984, month: 11, day: 11,
hour: 12, minute: 31, second: 14, nanosecond: 645876123,
timezone: 'Europe/Stockholm'
}) AS d
RETURN d.timezone, d.offset, d.offsetMinutes, d.epochSeconds, d.epochMillis

Result

d.timezone d.offset d.offsetMinutes d.epochSeconds d.epochMillis

"Europe/Stockholm" "+01:00" 60 469020674 469020674645

Rows: 1

Example 65. date.truncate

Get the first day of the current year:

Query

RETURN date.truncate('year') AS day

Result

day

2022-01-01

Rows: 1

Example 66. date.truncate

Get the date of the Thursday in the week of a specific date:

Query

RETURN date.truncate('week', date('2019-10-01'), {dayOfWeek: 4}) AS thursday

Result

thursday

2019-10-03

Rows: 1

Durations

Specifying durations
A DURATION represents a temporal amount, capturing the difference in time between two instants, and can
be negative.

The specification of a DURATION is prefixed with a P, and can use either a unit-based form or a date-and-
time-based form:

• Unit-based form: P[nY][nM][nW][nD][T[nH][nM][nS]]


◦ The square brackets ([]) denote an optional component (components with a zero value may be
omitted).
◦ The n denotes a numeric value within the bounds of a 64-bit integer.

◦ The value of the last — and smallest — component may contain a decimal fraction.

◦ Each component must be suffixed by a component identifier denoting the unit.

◦ The unit-based form uses M as a suffix for both months and minutes. Therefore, time parts must
always be preceded with T, even when no components of the date part are given.
◦ The maximum total length of a duration is bounded by the number of seconds that can be held in a
64-bit integer.

• Date-and-time-based form: P<date>T<time>.


◦ Unlike the unit-based form, this form requires each component to be within the bounds of a valid
LOCAL DATETIME.

The following table lists the component identifiers for the unit-based form:

Component identifier Description Comments

Y Years

M Months Must be specified before T.

W Weeks

D Days

H Hours

M Minutes Must be specified after T.

S Seconds

Components of durations
A DURATION can have several components, each categorized into Months, Days, and Seconds groups.

Components of DURATION values are truncated within their component groups as follows:

First order DURATION components

Component Group | Component | Description | Type | Details

Months | duration.years | The total number of years. | INTEGER | Each set of 4 quarters is counted as 1 year; each set of 12 months is counted as 1 year.
Months | duration.quarters | The total number of quarters. | INTEGER | Each year is counted as 4 quarters; each set of 3 months is counted as 1 quarter.
Months | duration.months | The total number of months. | INTEGER | Each year is counted as 12 months; each quarter is counted as 3 months.
Days | duration.weeks | The total number of weeks. | INTEGER | Each set of 7 days is counted as 1 week.
Days | duration.days | The total number of days. | INTEGER | Each week is counted as 7 days.
Seconds | duration.hours | The total number of hours. | INTEGER | Each set of 60 minutes is counted as 1 hour; each set of 3600 seconds is counted as 1 hour.
Seconds | duration.minutes | The total number of minutes. | INTEGER | Each hour is counted as 60 minutes; each set of 60 seconds is counted as 1 minute.
Seconds | duration.seconds | The total number of seconds. | INTEGER | Each hour is counted as 3600 seconds; each minute is counted as 60 seconds.
Seconds | duration.milliseconds | The total number of milliseconds. | INTEGER | Each set of 1000 milliseconds is counted as 1 second.
Seconds | duration.microseconds | The total number of microseconds. | INTEGER | Each millisecond is counted as 1000 microseconds.
Seconds | duration.nanoseconds | The total number of nanoseconds. | INTEGER | Each microsecond is counted as 1000 nanoseconds.

Please note that:

• Cypher uses UTC-SLS when handling leap seconds.

• There are not always 24 hours in 1 day; when switching to/from daylight savings
time, a day can have 23 or 25 hours.

• There are not always the same number of days in a month.

• Due to leap years, there are not always the same number of days in a year.

It is also possible to access the second order components of a component group bounded by the first order
component of the group:

Second order DURATION components

Component | Component Group | Description | Type

duration.quartersOfYear | Months | The number of quarters in the group that do not make a whole year. | INTEGER
duration.monthsOfYear | Months | The number of months in the group that do not make a whole year. | INTEGER
duration.monthsOfQuarter | Months | The number of months in the group that do not make a whole quarter. | INTEGER
duration.daysOfWeek | Days | The number of days in the group that do not make a whole week. | INTEGER
duration.minutesOfHour | Seconds | The number of minutes in the group that do not make a whole hour. | INTEGER
duration.secondsOfMinute | Seconds | The number of seconds in the group that do not make a whole minute. | INTEGER
duration.millisecondsOfSecond | Seconds | The number of milliseconds in the group that do not make a whole second. | INTEGER
duration.microsecondsOfSecond | Seconds | The number of microseconds in the group that do not make a whole second. | INTEGER
duration.nanosecondsOfSecond | Seconds | The number of nanoseconds in the group that do not make a whole second. | INTEGER

Examples
Below are examples of parsing durations using the duration() function. More information can be found
here.

Example 67. Return a duration of 14 days, 16 hours, and 12 minutes

Query

RETURN duration('P14DT16H12M') AS theDuration

Result

theDuration

P14DT16H12M

Rows: 1

Example 68. Return a duration of 5 months, 1 day, and 12 hours

Query

RETURN duration('P5M1.5D') AS theDuration

Result

theDuration

P5M1DT12H

Rows: 1

Example 69. Return a duration of 45 seconds

Query

RETURN duration('PT0.75M') AS theDuration

Result

theDuration

PT45S

Rows: 1

Example 70. Return a duration of 2 weeks, 3 days, and 12 hours

Query

RETURN duration('P2.5W') AS theDuration

Result

theDuration

P17DT12H

Rows: 1

Example 71. Get the month-based components of a DURATION value

Query

WITH duration({years: 1, months: 5, days: 111, minutes: 42}) AS d


RETURN d.years, d.quarters, d.quartersOfYear, d.months, d.monthsOfYear, d.monthsOfQuarter

Result

d.years | d.quarters | d.quartersOfYear | d.months | d.monthsOfYear | d.monthsOfQuarter

1 | 5 | 1 | 17 | 5 | 2

Rows: 1

d.quarters has a value of 5 because the year of the duration has four quarters and there is another
quarter in the five months. d.months has a value of 17 because it adds the 12 months in the year of
the duration to the five months. d.quartersOfYear is the remaining quarter, counting towards the
next full year. Similarly, d.monthsOfYear and d.monthsOfQuarter count towards the next full year and
quarter respectively. See tables First order DURATION components and Second order DURATION
components in Components of durations.

Example 72. Get the days-based components of a DURATION value

Query

WITH duration({months: 5, days: 25, hours: 1}) AS d


RETURN d.weeks, d.days, d.daysOfWeek

Result

d.weeks d.days d.daysOfWeek

3 25 4

Rows: 1

d.weeks has a value of 3 because the 25 days from the query are three full weeks (or 21 days).
d.daysOfWeek are the remaining days, counting towards the next full week. See tables First order
DURATION components and Second order DURATION components in Components of durations.

Example 73. Get the first order seconds-based components of a DURATION value

Query

WITH duration({
years: 1, months:1, days:1, hours: 1,
minutes: 1, seconds: 1, nanoseconds: 111111111
}) AS d
RETURN d.hours, d.minutes, d.seconds, d.milliseconds, d.microseconds, d.nanoseconds

Result

d.hours d.minutes d.seconds d.milliseconds d.microseconds d.nanoseconds

1 61 3661 3661111 3661111111 3661111111111

Rows: 1

d.minutes is the sum of 60 minutes of the hour and the one minute from the query as both
duration.hours and duration.minutes are both seconds-based components. Similarly, d.seconds,
d.milliseconds, d.microseconds and d.nanoseconds are sum values of the relevant seconds-based
components from the query.

d.hours does not take the day from the query into account because duration.days is a days-based
component.

See table First order DURATION components in Components of durations.

Example 74. Get the second order seconds-based components of a DURATION value

Query

WITH duration({
years: 1, months:1, days:1,
hours: 1, minutes: 1, seconds: 1, nanoseconds: 111111111
}) AS d
RETURN d.minutesOfHour, d.secondsOfMinute, d.millisecondsOfSecond, d.microsecondsOfSecond,
d.nanosecondsOfSecond

Result

d.minutesOfHour | d.secondsOfMinute | d.millisecondsOfSecond | d.microsecondsOfSecond | d.nanosecondsOfSecond

1 | 1 | 111 | 111111 | 111111111

Rows: 1

The returned values all count towards the next full hour, minute or second respectively. For example,
d.microsecondsOfSecond has a value of 111111 because it is the 111111111 nanoseconds from the
query in microseconds (rounded down) but it is not another full second.

See table Second order DURATION components in Components of durations.

Example 75. Create a duration representing 1.5 days

Query

RETURN duration({days: 1, hours: 12}) AS theDuration

Result

theDuration

P1DT12H

Rows: 1

Example 76. Compute the DURATION between two temporal instants

Query

RETURN duration.between(date('1984-10-11'), date('2015-06-24')) AS theDuration

Result

theDuration

P30Y8M13D

Rows: 1

Example 77. Compute the number of days between two DATE values

Query

RETURN duration.inDays(date('2014-10-11'), date('2015-08-06')) AS theDuration

Result

theDuration

P299D

Rows: 1

Example 78. Get the DATE of the last day of the next month

Query

RETURN date.truncate('month', date() + duration('P2M')) - duration('P1D') AS lastDay

Result

lastDay

2022-07-31

Rows: 1

Example 79. Add a DURATION to a ZONED TIME

Query

RETURN time('13:42:19') + duration({days: 1, hours: 12}) AS theTime

Result

theTime

01:42:19.000000000+00:00

Rows: 1

Example 80. Add two DURATION values

Query

RETURN duration({days: 2, hours: 7}) + duration({months: 1, hours: 18}) AS theDuration

Result

theDuration

P1M2DT25H

Rows: 1

Example 81. Multiply a DURATION by a number

Query

RETURN duration({hours: 5, minutes: 21}) * 14 AS theDuration

Result

theDuration

PT74H54M

Rows: 1

Example 82. Divide a DURATION by a number

Query

RETURN duration({hours: 3, minutes: 16}) / 2 AS theDuration

Result

theDuration

PT1H38M

Rows: 1

Example 83. Examine whether two instants are less than one day apart

Query

WITH
datetime('2015-07-21T21:40:32.142+0100') AS date1,
datetime('2015-07-21T17:12:56.333+0100') AS date2
RETURN
CASE
WHEN date1 < date2 THEN date1 + duration("P1D") > date2
ELSE date2 + duration("P1D") > date1
END AS lessThanOneDayApart

Result

lessThanOneDayApart

true

Rows: 1

Example 84. Return the abbreviated name of the current month

Query

RETURN ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"][date
().month-1] AS month

Result

month

"Jun"

Rows: 1

Temporal indexing
All temporal types can be indexed, and thereby support exact lookups for equality predicates. Indexes for
temporal instant types additionally support range lookups.
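
As a minimal sketch of this (the Person label and the born property are assumed for illustration; in Cypher 5, CREATE INDEX creates a range index by default), an index on a DATE property supports both equality and range predicates:

Query

CREATE INDEX person_born IF NOT EXISTS FOR (p:Person) ON (p.born)

Query

MATCH (p:Person)
WHERE p.born >= date('1960-01-01') AND p.born < date('1970-01-01')
RETURN p.name AS name, p.born AS born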

Spatial values
Cypher has built-in support for handling spatial values (POINT values), which can be stored as properties on
nodes and relationships in Neo4j databases.

This section begins with an explanation of the POINT type. It then proceeds to discuss Cypher’s support of
Coordinate Reference Systems, and how to work with spatial instants in Cypher, including how spatial
point instants work with Cypher indexing. Finally, it briefly explains comparability and orderability with
regard to spatial instants.

For more information about spatial functions, allowing for the creation and manipulation
of spatial values, see the section on Spatial functions.
For more information about the comparison and ordering of spatial values, see the
section on the ordering and comparison of values.

The POINT type


Neo4j supports the POINT type for values of spatial geometry.

Values with the POINT type have the following characteristics:

• Each point can have either 2 or 3 dimensions. This means it contains either 2 or 3 64-bit FLOAT values,
which together are called the Coordinate.

• Each point will also be associated with a specific Coordinate Reference System (CRS) that determines
the meaning of the values in the Coordinate.

• Instances of POINT and LIST<POINT> can be assigned to node and relationship properties.

• Nodes and relationships with POINT or LIST<POINT> properties can be indexed using a point index. This
is true for all CRSs (and for both 2D and 3D).

• The distance function will work on points in all CRS and in both 2D and 3D, but only if the two points
have the same CRS (and therefore also same dimension).

Coordinate Reference Systems


Four Coordinate Reference Systems (CRS) are supported, each of which falls within one of two types:
geographic coordinates, modeling points on the earth, or Cartesian coordinates, modeling points in
euclidean space:

Data within different coordinate systems are entirely incomparable, and cannot be implicitly converted
from one to the other. This is true even if they are both Cartesian or both geographic but of a different
dimension. For example, if you search for 3D points using a 2D range, you will get no results. However,
they can be ordered, as discussed in more detail in the section about ordering and comparison of values.

Geographic coordinate reference systems


Two Geographic Coordinate Reference Systems (CRS) are supported, modeling points on the earth:

• WGS 84 2D
◦ A 2D geographic point in the WGS 84 CRS is specified in one of two ways:

▪ longitude and latitude (if these are specified, and the crs is not, then the crs is assumed to
be WGS-84).
▪ x and y (in this case the crs must be specified, or will be assumed to be Cartesian).

◦ Specifying this CRS can be done using either the name 'wgs-84' or the SRID 4326 as described in
point() - WGS 84 2D.

• WGS 84 3D

◦ A 3D geographic point in the WGS 84 CRS is specified one of in two ways:

▪ longitude, latitude and either height or z (if these are specified, and the crs is not, then the
crs is assumed to be WGS-84-3D).
▪ x, y and z (in this case the crs must be specified, or will be assumed to be Cartesian-3D).

◦ Specifying this CRS can be done using either the name 'wgs-84-3d' or the SRID 4979 as
described in point() - WGS 84 3D.

Converting coordinate units


The units of the latitude and longitude fields are in decimal degrees, and need to be specified as floating
point numbers using Cypher literals. It is not possible to use any other format, such as 'degrees, minutes,
seconds'. The units of the height field are in meters. When geographic points are passed to the distance
function, the result will always be in meters. If the coordinates are in any other format or unit than those
supported, it is necessary to explicitly convert them.

For example, if the incoming $height is a STRING field in kilometers, it would be necessary to add height:
toFloat($height) * 1000 to the query. Likewise if the results of the distance function are expected to be
returned in kilometers, an explicit conversion is required. The below query is an example of this conversion:

Query

WITH
point({latitude: toFloat('13.43'), longitude: toFloat('56.21')}) AS p1,
point({latitude: toFloat('13.10'), longitude: toFloat('56.41')}) AS p2
RETURN toInteger(point.distance(p1, p2)/1000) AS km

Result

km

42

Rows: 1

Cartesian coordinate reference systems


Two Cartesian Coordinate Reference Systems (CRS) are supported, modeling points in euclidean space:

• Cartesian 2D
◦ A 2D point in the Cartesian CRS is specified with a map containing x and y coordinate values

◦ Specifying this CRS can be done using either the name 'cartesian' or the SRID 7203 as described
in point() - Cartesian 2D

• Cartesian 3D
◦ A 3D point in the Cartesian CRS is specified with a map containing x, y and z coordinate values

◦ Specifying this CRS can be done using either the name 'cartesian-3d' or the SRID 9157 as
described in point() - Cartesian 3D.

The units of the x, y, and z fields are unspecified. This means that when two Cartesian points are passed to
the distance function, the resulting value will be in the same units as the original coordinates. This is true
for both 2D and 3D points, as the Pythagoras equation used is generalized to any number of dimensions.
However, just as you cannot compare geographic points to Cartesian points, you cannot calculate the
distance between a 2D point and a 3D point. If you need to do that, explicitly transform the one type into
the other. For example:

Query

WITH
point({x: 3, y: 0}) AS p2d,
point({x: 0, y: 4, z: 1}) AS p3d
RETURN
point.distance(p2d, p3d) AS bad,
point.distance(p2d, point({x: p3d.x, y: p3d.y})) AS good

Result

bad good

<null> 5.0

Rows: 1

Spatial instants
All POINT types are created from two components:

• The Coordinate containing either 2 or 3 FLOAT values (64-bit).

• The Coordinate Reference System (or CRS) defining the meaning (and possibly units) of the values in
the Coordinate.

For most use cases, it is not necessary to specify the CRS explicitly as it will be deduced from the keys
used to specify the coordinate. Two rules are applied to deduce the CRS from the coordinate:

• Choice of keys:
◦ If the coordinate is specified using the keys latitude and longitude, the CRS will be assumed to be
Geographic and therefore either WGS-84 or WGS-84-3D.
◦ If instead x and y are used, then the default CRS would be Cartesian or Cartesian-3D.

• Number of dimensions:
◦ If there are 2 dimensions in the coordinate (x & y, or longitude & latitude), the CRS will be a 2D
CRS.
◦ If there is a third dimension in the coordinate (z or height), the CRS will be a 3D CRS.

All fields are provided to the point function in the form of a map of explicitly named arguments. Neo4j
does not support an ordered list of coordinate fields because of the contradictory conventions between
geographic and cartesian coordinates, where geographic coordinates normally list y before x (latitude
before longitude).

The following query returns points created in each of the four supported CRSs. Take particular note
of the order and keys of the coordinates in the original point function, and how those values are displayed
in the results:

Query

RETURN
point({x: 3, y: 0}) AS cartesian_2d,
point({x: 0, y: 4, z: 1}) AS cartesian_3d,
point({latitude: 12, longitude: 56}) AS geo_2d,
point({latitude: 12, longitude: 56, height: 1000}) AS geo_3d

Result

cartesian_2d | cartesian_3d | geo_2d | geo_3d

point({srid:7203, x: 3.0, y: 0.0}) | point({srid:9157, x: 0.0, y: 4.0, z: 1.0}) | point({srid:4326, x: 56.0, y: 12.0}) | point({srid:4979, x: 56.0, y: 12.0, z: 1000.0})

Rows: 1

For the geographic coordinates, it is important to note that the latitude value should always lie in the
interval [-90, 90]. Any other value outside this range will throw an exception. The longitude value should
always lie in the interval [-180, 180]. Any other value outside this range will be wrapped around to fit in
this range. The height value and any Cartesian coordinates are not explicitly restricted. Any value within
the allowed range of the signed 64-bit floating point type will be accepted.

Components of points
Components of POINT values can be accessed as properties.

Components of POINT instances and where they are supported

Component | Description | Type | Range/Format | Supported for

instant.x | The first element of the Coordinate. | FLOAT | Number literal, range depends on CRS. | WGS-84, WGS-84-3D, Cartesian, Cartesian-3D
instant.y | The second element of the Coordinate. | FLOAT | Number literal, range depends on CRS. | WGS-84, WGS-84-3D, Cartesian, Cartesian-3D
instant.z | The third element of the Coordinate. | FLOAT | Number literal, range depends on CRS. | WGS-84-3D, Cartesian-3D
instant.longitude | The first element of the Coordinate for geographic CRSs, degrees East of the prime meridian. | FLOAT | Number literal, -180.0 to 180.0. | WGS-84, WGS-84-3D
instant.latitude | The second element of the Coordinate for geographic CRSs, degrees North of the equator. | FLOAT | Number literal, -90.0 to 90.0. | WGS-84, WGS-84-3D
instant.height | The third element of the Coordinate for geographic CRSs, meters above the ellipsoid defined by the datum (WGS-84). | FLOAT | Number literal, range limited only by the underlying 64-bit floating point type. | WGS-84-3D
instant.crs | The name of the CRS. | STRING | One of wgs-84, wgs-84-3d, cartesian, cartesian-3d. | WGS-84, WGS-84-3D, Cartesian, Cartesian-3D
instant.srid | The internal Neo4j ID for the CRS. | INTEGER | One of 4326, 4979, 7203, 9157. | WGS-84, WGS-84-3D, Cartesian, Cartesian-3D

Examples
The following query shows how to extract the components of a Cartesian 2D POINT value:

Query

WITH point({x: 3, y: 4}) AS p


RETURN
p.x AS x,
p.y AS y,
p.crs AS crs,
p.srid AS srid

Result

x y crs srid

3.0 4.0 "cartesian" 7203

Rows: 1

The following query shows how to extract the components of a WGS-84 3D POINT value:

Query

WITH point({latitude: 3, longitude: 4, height: 4321}) AS p


RETURN
p.latitude AS latitude,
p.longitude AS longitude,
p.height AS height,
p.x AS x,
p.y AS y,
p.z AS z,
p.crs AS crs,
p.srid AS srid

Result

latitude longitude height x y z crs srid

3.0 4.0 4321.0 4.0 3.0 4321.0 "wgs-84-3d" 4979

Rows: 1

Spatial values and indexes


If there is a range or point index on a particular node or relationship property, and a spatial point is
assigned to that property on a node or relationship, the node or relationship will be indexed.

In a point index, Neo4j uses space filling curves in 2D or 3D over an underlying generalized B+Tree. Point
indexes are optimized for distance and bounding box queries. For more information, see Managing indexes
→ Point indexes.

In a range index, the points will be sorted according to their lexicographic ordering per coordinate
reference system. For point values, this index has support for equality checks. For more information, see
Managing indexes → Range indexes.
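
As a minimal sketch (the Place label and location property are assumed for illustration), a point index can be created and then used by distance and bounding box predicates:

Query

CREATE POINT INDEX place_location IF NOT EXISTS FOR (p:Place) ON (p.location)

Query

MATCH (p:Place)
WHERE point.distance(p.location, point({latitude: 55.61, longitude: 12.99})) < 1000
RETURN p.name AS name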

Comparability and orderability


Cypher does not support comparing spatial values using the inequality operators, <, <=, >, and >=.
Attempting to do so will return null.

To compare spatial points within a specific range, instead use the spatial functions point.distance or
point.withinBBox.
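
For example, point.withinBBox tests whether a point lies inside the bounding box defined by a lower-left and an upper-right point:

Query

RETURN point.withinBBox(
  point({x: 1, y: 1}),
  point({x: 0, y: 0}),
  point({x: 2, y: 2})
) AS insideBox

Result

insideBox

true

Rows: 1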

Working with null


In Cypher, null is used to represent missing or undefined values. All data types in Cypher are nullable. This
means that type predicate expressions always return true for null values.

Conceptually, null means a missing or unknown value, and it is treated somewhat differently from other
values. For example, returning a property from a node that does not have said property produces null.
Most expressions that take null as input will produce null. In the case of a predicate used in a WHERE
clause, anything that is not true is interpreted as being false.

null is not equal to null. Not knowing two values does not imply that they are the same value. This means
that the expression null = null yields null, and not true.

Logical operations with null


The logical operators (AND, OR, XOR, NOT) treat null as the unknown value of three-valued logic.

Truth table for logical operators

a b a AND b a OR b a XOR b NOT a

false false false false false true

false null false null null true

false true false true true true

true false false true true false

true null null true null false

true true true true false false

null false false null null null

null null null null null null

null true null true null null

The IN operator and null


The IN operator follows similar logic. If Cypher can ascertain that something exists in a list, the result will
be true. Any list that contains a null and does not have a matching element will return null. Otherwise,
the result will be false.

examples of expressions containing the IN operator

Expression Result

2 IN [1, 2, 3] true

2 IN [1, null, 3] null

2 IN [1, 2, null] true

2 IN [1] false

2 IN [] false

null IN [1, 2, 3] null

null IN [1, null, 3] null

null IN [] false

Using all, any, none, and single follows a similar rule. If the result can be calculated definitively, true or
false is returned. Otherwise null is produced.
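
A small sketch of this behavior:

Query

RETURN
  any(x IN [null, 1] WHERE x = 1) AS resolvedTrue,
  any(x IN [null, 2] WHERE x = 1) AS unknown,
  all(x IN [1, null] WHERE x = 1) AS alsoUnknown

Result

resolvedTrue unknown alsoUnknown

true <null> <null>

Rows: 1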

The [] operator and null


Accessing a list or a map with null will result in null:

Expression Result

[1, 2, 3][null] null

[1, 2, 3, 4][null..2] null

[1, 2, 3][1..null] null

{age: 25}[null] null

Using parameters to pass in the bounds, such as a[$lower..$upper], may result in a null for the lower or
upper bound (or both). The following workaround will prevent this from happening by setting the absolute
minimum and maximum bound values:

a[coalesce($lower,0)..coalesce($upper,size(a))]

Expressions that return null


• Getting a missing element from a list: [][0], head([]).

• Trying to access a property that does not exist on a node or relationship: n.missingProperty.

• Comparisons when either side is null: 1 < null.

• Arithmetic expressions containing null: 1 + null.

• Some function calls where any argument is null: e.g., sin(null).

Using IS NULL and IS NOT NULL


Testing any value against null, either with the = operator or with the <> operator, always evaluates to
null. Therefore, use the special equality operators IS NULL or IS NOT NULL.
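
For example:

Query

WITH null AS missing
RETURN missing = null AS equalityCheck, missing IS NULL AS isNullCheck

Result

equalityCheck isNullCheck

<null> true

Rows: 1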


Lists
Cypher includes comprehensive support for lists. This section first describes lists in general, and then
discusses how to use list comprehension and pattern comprehension in lists.

Information regarding operators, such as list concatenation (+), element existence

checking (IN), and access ([]) can be found here. The behavior of the IN and []
operators with respect to null is detailed here.

Lists in general
A literal list is created by using brackets and separating the elements in the list with commas.

Query

RETURN [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] AS list

Result

list

[0,1,2,3,4,5,6,7,8,9]

Rows: 1

A list can consist of different value types.

Query

RETURN [0, "hello", 3.14, null] AS list

Result

list

[0, "hello", 3.14, null]

Rows: 1

Lists are indexed by 0 in Cypher. To access individual elements in a list, use square brackets. A range of
elements can also be extracted; this extracts from the start index and up to, but not including, the end index.

For example:

Query

WITH [5,1,7] AS list


RETURN list[2]

Result

list[2]

7

Rows: 1

List range and size


The below examples use the range function to create lists. This function returns a list containing all
numbers between given start and end numbers. The range is inclusive in both ends.

Query

RETURN range(0, 10)[3] AS element

Result

element

3

Rows: 1

It is also possible to use negative numbers, to start from the end of the list instead.

Query

RETURN range(0, 10)[-3] AS element

Result

element

8

Rows: 1

Finally, it is possible to use ranges inside the brackets to return ranges of the list. The list range operator
([]) is inclusive of the first value, but exclusive of the last value.

Query

RETURN range(0, 10)[0..3] AS list

Result

list

[0,1,2]

Rows: 1

Query

RETURN range(0, 10)[0..-5] AS list

Result

list

[0,1,2,3,4,5]

Rows: 1

Query

RETURN range(0, 10)[-5..] AS list

Result

list

[6,7,8,9,10]

Rows: 1

Query

RETURN range(0, 10)[..4] AS list

Result

list

[0,1,2,3]

Rows: 1

Out-of-bound slices are simply truncated, but out-of-bound single elements return null.

Query

RETURN range(0, 10)[15] AS list

Result

list

<null>

Rows: 1

Query

RETURN range(0, 10)[5..15] AS list

Result

list

[5,6,7,8,9,10]

Rows: 1

The size of a list can be obtained as follows:

Query

RETURN size(range(0, 10)[0..3]) AS list

Result

list

3

Rows: 1

Pattern comprehension
Pattern comprehension is a syntactic construct available in Cypher for creating a list based on matchings
of a pattern. A pattern comprehension matches the specified pattern like a normal MATCH clause, with
predicates like a normal WHERE clause, but yields a custom projection as specified.

Example graph
The following graph is used for examples below:

[Example graph: a Person node (name: 'Keanu Reeves') with ACTED_IN relationships to seven Movie nodes:
'Johnny Mnemonic' (1995), 'The Devil's Advocate' (1997), 'The Matrix' (1999), 'The Replacements' (2000),
'The Matrix Revolutions' (2003), 'The Matrix Reloaded' (2003), and 'The Matrix Resurrections' (2021).]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(keanu:Person {name: 'Keanu Reeves'}),
(johnnyMnemonic:Movie {title: 'Johnny Mnemonic', released: 1995}),
(theMatrixRevolutions:Movie {title: 'The Matrix Revolutions', released: 2003}),
(theMatrixReloaded:Movie {title: 'The Matrix Reloaded', released: 2003}),
(theReplacements:Movie {title: 'The Replacements', released: 2000}),
(theMatrix:Movie {title: 'The Matrix', released: 1999}),
(theDevilsAdvocate:Movie {title: 'The Devils Advocate', released: 1997}),
(theMatrixResurrections:Movie {title: 'The Matrix Resurrections', released: 2021}),
(keanu)-[:ACTED_IN]->(johnnyMnemonic),
(keanu)-[:ACTED_IN]->(theMatrixRevolutions),
(keanu)-[:ACTED_IN]->(theMatrixReloaded),
(keanu)-[:ACTED_IN]->(theReplacements),
(keanu)-[:ACTED_IN]->(theMatrix),
(keanu)-[:ACTED_IN]->(theDevilsAdvocate),
(keanu)-[:ACTED_IN]->(theMatrixResurrections)

Examples
This example returns a list that contains the year when the movies were released. The pattern matching in
the pattern comprehension looks for Matrix in the movie title and that the node keanu (Person node with
the name Keanu Reeves) has a relationship with the movie.

Query

MATCH (keanu:Person {name: 'Keanu Reeves'})


RETURN [(keanu)-->(b:Movie) WHERE b.title CONTAINS 'Matrix' | b.released] AS years

Result

years

[2021,2003,2003,1999]

Rows: 1

The whole predicate, including the WHERE keyword, is optional and may be omitted.

Storing lists as properties

It is possible to store homogenous lists of simple values as properties. For example, the following query
creates a list from the title properties of the Movie nodes connected to Keanu Reeves. It then sets that list
as a resume property on Keanu Reeves.

Query

MATCH (keanu:Person {name: 'Keanu Reeves'})


WITH keanu,[(keanu)-->(b:Movie) | b.title] AS movieTitles
SET keanu.resume = movieTitles
RETURN keanu.resume

Result

keanu.resume

["The Matrix Resurrections", "The Devils Advocate", "The Matrix", "The Replacements", "The Matrix Reloaded",
"The Matrix Revolutions", "Johnny Mnemonic"]

Rows: 1

It is not, however, possible to store heterogeneous lists as properties. For example, the following query,
which tries to set a list including both the title and the released properties as the resume property of
Keanu Reeves will fail. This is because the title property values are stored as STRING values, while the
released property values are stored as INTEGER values.

Query

MATCH (keanu:Person {name: 'Keanu Reeves'})


WITH keanu,[(keanu)-->(b:Movie) | b.title] + [(keanu)-->(b:Movie) | b.released] AS movieTitles
SET keanu.resume = movieTitles
RETURN keanu.resume

Neo4j only supports a subset of Cypher types for storage as singleton or array properties. Please refer to
section cypher/syntax/values of the manual for more details.

List comprehension
List comprehension is a syntactic construct available in Cypher for creating a list based on existing lists.

For example, the following query returns a new list from the previously created resume property (a list of
strings) of Keanu Reeves:

Query

MATCH (keanu:Person {name:'Keanu Reeves'})


RETURN [x IN keanu.resume WHERE x contains 'The Matrix'] AS matrixList

Result

matrixList

["The Matrix Resurrections", "The Matrix", "The Matrix Reloaded", "The Matrix Revolutions"]

Rows: 1

List comprehension follows the form of the mathematical set-builder notation (set comprehension) instead
of the use of map and filter functions.

Query

RETURN [x IN range(0,10) WHERE x % 2 = 0 | x^3 ] AS result

Result

result

[0.0,8.0,64.0,216.0,512.0,1000.0]

Rows: 1

Either the WHERE part, or the expression, can be omitted, if you only want to filter or map respectively.

Query

RETURN [x IN range(0,10) WHERE x % 2 = 0 ] AS result

Result

result

[0,2,4,6,8,10]

Rows: 1

Query

RETURN [x IN range(0,10) | x^3 ] AS result

Result

result

[0.0,1.0,8.0,27.0,64.0,125.0,216.0,343.0,512.0,729.0,1000.0]

Rows: 1

Maps
Cypher supports the construction of maps. This section first discusses literal maps and then moves on to
map projection.

Information regarding property access operators such as . and [] can be found here.
The behavior of the [] operator with respect to null is detailed here.

Literal maps
The key names in a map must be literals. If returned through an HTTP API call, a JSON object will be
returned. If returned in Java, an object of type java.util.Map<String,Object> will be returned.

Query

RETURN {key: 'Value', listKey: [{inner: 'Map1'}, {inner: 'Map2'}]} AS map

Result

map

{'listKey': [{'inner': 'Map1'}, {'inner': 'Map2'}], 'key': 'Value'}

Rows: 1

Map projection
Cypher supports map projections, which allow for the construction of maps from nodes,
relationships, and other map values.

A map projection begins with the variable bound to the graph entity to be projected from, and contains a
body of comma-separated map elements, enclosed by { and }.

Map projection

map_variable {map_element, [, ...n]}

A map element projects one or more key-value pairs to the map projection. There exist four different types
of map projection elements:

• Property selector - Projects the property name as the key, and the value from the map_variable as the
value for the projection.

• Literal entry - This is a key-value pair, with the value being an arbitrary expression key: <expression>.

• Variable selector - Projects a variable, with the variable name as the key, and the value the variable is
pointing to as the value of the projection. Its syntax is just the variable.

• All-properties selector - projects all key-value pairs from the map_variable value.

The following conditions apply:

• If the map_variable points to a null value, the whole map projection will evaluate to null.

• The key names in a map must be of type STRING.

Example graph
The following graph is used for the examples below:

[Example graph: Person nodes 'Keanu Reeves' (nationality: 'Canadian') and 'Carrie-Anne Moss' with ACTED_IN
relationships to Movie nodes 'The Matrix' (1999), 'The Matrix Reloaded' (2003), 'The Matrix Revolutions' (2003),
'The Matrix Resurrections' (2021), and 'The Devil's Advocate' (1997); Carrie-Anne Moss is connected to the four
Matrix movies only.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(keanu:Person {name: 'Keanu Reeves', nationality: 'Canadian'}),
(carrieAnne:Person {name: 'Carrie-Anne Moss'}),
(theMatrixRevolutions:Movie {title: 'The Matrix Revolutions', released: 2003}),
(theMatrixReloaded:Movie {title: 'The Matrix Reloaded', released: 2003}),
(theMatrix:Movie {title: 'The Matrix', released: 1999}),
(theDevilsAdvocate:Movie {title: 'The Devils Advocate', released: 1997}),
(theMatrixResurrections:Movie {title: 'The Matrix Resurrections', released: 2021}),
(keanu)-[:ACTED_IN]->(theMatrix),
(keanu)-[:ACTED_IN]->(theMatrixRevolutions),
(keanu)-[:ACTED_IN]->(theMatrixReloaded),
(keanu)-[:ACTED_IN]->(theMatrixResurrections),
(keanu)-[:ACTED_IN]->(theDevilsAdvocate),
(carrieAnne)-[:ACTED_IN]->(theMatrix),
(carrieAnne)-[:ACTED_IN]->(theMatrixRevolutions),
(carrieAnne)-[:ACTED_IN]->(theMatrixReloaded),
(carrieAnne)-[:ACTED_IN]->(theMatrixResurrections)

Examples
The below query finds the Keanu Reeves node and the movies he has acted in. It is an example of a map
projection with a literal entry, which in turn also uses map projection inside the aggregating collect()
function.

Query

MATCH (keanu:Person {name: 'Keanu Reeves'})-[:ACTED_IN]->(movie:Movie)


WITH keanu, collect(movie{.title, .released}) AS movies
RETURN keanu{.name, movies: movies}

Result

keanu

{movies: [{title: "The Devils Advocate", released: 1997}, {title: "The Matrix Revolutions", released: 2003},
{title: "The Matrix Resurrections", released: 2021}, {title: "The Matrix Reloaded", released: 2003}, {title:
"The Matrix", released: 1999}], name: "Keanu Reeves"}

Rows: 1

The below query finds all Person nodes in the graph that have one or more relationships with the type
ACTED_IN connected to Movie nodes. It uses the count() function to count how many Movie nodes are
connected to each Person node in this way, and uses a variable selector to project the value of the count.

Query

MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)
WITH actor, count(movie) AS numberOfMovies
RETURN actor{.name, numberOfMovies}

Result

actor

{numberOfMovies: 5, name: "Keanu Reeves"}

{numberOfMovies: 4, name: "Carrie-Anne Moss"}

Rows: 2

The below query returns all properties from the Keanu Reeves node. An all-properties selector is used to
project all the node properties, and additionally, explicitly project the property age. Since this property does
not exist on the node Keanu Reeves, a null value is projected instead.

Query

MATCH (keanu:Person {name: 'Keanu Reeves'})


RETURN keanu{.*, .age}

Result

keanu

{nationality: "Canadian", name: "Keanu Reeves", age: null}

Rows: 1

The below query is an example of statically accessing individual map members using the . operator:

Query

WITH {age: 58, profession: 'Actor'} as keanuStats


RETURN keanuStats.profession AS profession

Result

profession

"Actor"


Rows: 1

Casting data values


Cypher supports a number of functions to cast values to different data types. This section will provide an
overview of those functions, as well as examples of how to use them in practice.

Functions for converting data values


The following functions are available for casting data values:

Function | Description

toBoolean() | Converts a STRING, INTEGER, or BOOLEAN value to a BOOLEAN value.
toBooleanList() | Converts a LIST<ANY> and returns a LIST<BOOLEAN>. If any values are not convertible to BOOLEAN they will be null in the LIST<BOOLEAN> returned.
toBooleanOrNull() | Converts a STRING, INTEGER, or BOOLEAN value to a BOOLEAN value. For any other input value, null will be returned.
toFloat() | Converts an INTEGER, FLOAT, or STRING value to a FLOAT value. Otherwise null is returned.
toFloatList() | Converts a LIST<ANY> and returns a LIST<FLOAT>. If any values are not convertible to FLOAT they will be null in the LIST<FLOAT> returned.
toFloatOrNull() | Converts an INTEGER, FLOAT, or STRING value to a FLOAT. For any other input value, null will be returned.
toInteger() | Converts a BOOLEAN, INTEGER, FLOAT, or STRING value to an INTEGER value.
toIntegerList() | Converts a LIST<ANY> to a LIST<INTEGER>. If any values are not convertible to INTEGER they will be null in the LIST<INTEGER> returned.
toIntegerOrNull() | Converts a BOOLEAN, INTEGER, FLOAT, or STRING value to an INTEGER value. For any other input value, null will be returned.
toString() | Converts an INTEGER, FLOAT, BOOLEAN, STRING, POINT, DURATION, DATE, ZONED TIME, LOCAL TIME, LOCAL DATETIME, or ZONED DATETIME value to a STRING value.
toStringList() | Converts a LIST<ANY> and returns a LIST<STRING>. If any values are not convertible to STRING they will be null in the LIST<STRING> returned.
toStringOrNull() | Converts an INTEGER, FLOAT, BOOLEAN, STRING, POINT, DURATION, DATE, ZONED TIME, LOCAL TIME, LOCAL DATETIME, or ZONED DATETIME value to a STRING. For any other input value, null will be returned.

More information about these, and many other functions, can be found in the section on Functions.

Examples
The following graph is used for the examples below:

[Example graph: a Person node (name: 'Keanu Reeves', age: 58, active: true) connected by a KNOWS relationship
(since: 1999) to a Person node (name: 'Carrie-Anne Moss', age: 55, active: true).]

To recreate it, run the following query against an empty Neo4j database:

CREATE (keanu:Person {name:'Keanu Reeves', age: 58, active:true}),


(carrieAnne:Person {name:'Carrie-Anne Moss', age: 55, active:true}),
(keanu)-[r:KNOWS {since:1999}]->(carrieAnne)

Returning converted values


In the below query, the functions toFloat and toInteger are used to cast two property values. The result shows
that null is returned if the casting is not possible.

MATCH (keanu:Person {name:'Keanu Reeves'})


RETURN toFloat(keanu.age), toInteger(keanu.name)

Result

toFloat(keanu.age) toInteger(keanu.name)

58.0 null

If the function toFloat is passed an unsupported value (such as a DATE value), it will throw an error:

Query

WITH date({
year: 2023, month: 5, day: 2
}) AS d
RETURN toFloat(d)

Error message

Type mismatch: expected Float, Integer, Number or String but was Date (line 4, column 16 (offset: 66))
"RETURN toFloat(d)"

However, if the same value is passed to the function toFloatOrNull, null will be returned.

Query

WITH date({
year: 2023, month: 5, day: 2
}) AS d
RETURN toFloatOrNull(d)

Result

toFloatOrNull(d)

null

It is also possible to return cast values as a list. The below query uses the toStringList() function to cast all
passed values into STRING values, and return them as a LIST<STRING>:

MATCH (keanu:Person {name:'Keanu Reeves'})


RETURN toStringList([keanu.name, keanu.age]) AS keanuList

Result

keanuList

["Keanu Reeves", "58"]

Updating property value types


The functions to cast data values can be used to update property values on nodes and relationships. The
below query casts the age (INTEGER), active (BOOLEAN), and since (INTEGER) properties to STRING values:

MATCH (keanu:Person {name:'Keanu Reeves'})-[r:KNOWS]-()


SET keanu.age = toString(keanu.age),
keanu.active = toString(keanu.active),
r.since = toString(r.since)
RETURN keanu.age, keanu.active, r.since

Result

keanu.age keanu.active r.since

"58" "true" "1999"

Type predicate expressions


A type predicate expression can be used to verify the type of a variable, literal, property or other Cypher
expression.

Syntax

<expr> IS :: <TYPE>

Where <expr> is any Cypher expression and <TYPE> is a Cypher type. For all available Cypher types, see
the section on types and their synonyms.

Verify the type of a Cypher expression

UNWIND [42, true, 'abc', null] AS val


RETURN val, val IS :: INTEGER AS isInteger

val isInteger

42 true

true false

'abc' false

null true

Rows: 4

Type predicate expressions with NOT


It is also possible to verify that a Cypher expression is not of a certain type, using the negated type
predicate expression IS NOT ::.

UNWIND [42, true, 'abc', null] AS val


RETURN val, val IS NOT :: STRING AS notString

val notString

42 true

true true

'abc' false

null false

Rows: 4

Type predicate expressions for null (new in Neo4j 5.10)
All Cypher types include the null value. Since Neo4j 5.10, type predicate expressions can be appended
with NOT NULL. This means that IS :: returns true for all expressions evaluating to null, unless NOT NULL is
appended.

RETURN
null IS :: BOOLEAN AS isBoolean,
null IS :: BOOLEAN NOT NULL AS isNotNullBoolean

isBoolean isNotNullBoolean

true false

Rows: 1

Likewise, IS NOT :: returns false for all expressions evaluating to null, unless the type is appended with
NOT NULL.

RETURN
(null + 1) IS NOT :: DATE AS isNotDate,
(null + 1) IS NOT :: DATE NOT NULL AS isNotNotNullDate

isNotDate isNotNotNullDate

false true

Rows: 1

It is also possible to check whether a value is the only null value using the NULL type.

RETURN null IS :: NULL AS isNull

isNull

true

Rows: 1

Closed dynamic union types (INNER_TYPE_1 | INNER_TYPE_2…) cannot be declared as NOT NULL. Instead, all
the inner types should be individually declared as not nullable to achieve this behavior.

Note that all inner types in a closed dynamic union must be either nullable, or not nullable. This is because
null values cannot be attributed to a specific type. A syntax error will be raised if the inner types are not of
the same nullability.

RETURN 1 IS :: INTEGER NOT NULL | FLOAT

Error message

All types in a Closed Dynamic Union must be nullable, or be appended with `NOT NULL`.

Type predicate expression for properties
Type predicate expressions can also be used to filter out nodes or relationships with properties of a certain
type.

A graph containing the following nodes is used for the example below:

[Example graph: three Person nodes: (name: 'Alice', age: 18), (name: 'Bob', age: '20'), and
(name: 'Charlie', age: 21). Note that Bob's age is stored as a STRING.]

The following query finds all Person nodes with an age property that is an INTEGER with a value greater
than 18.

MATCH (n:Person)
WHERE n.age IS :: INTEGER AND n.age > 18
RETURN n.name AS name, n.age AS age

name age

'Charlie' 21

Rows: 1

The type PROPERTY VALUE can also be used to check whether a type is storable as a property. Types not
storable in properties, such as MAP, will return false when checked with IS :: PROPERTY VALUE.
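
A minimal sketch of this check:

RETURN 1 IS :: PROPERTY VALUE AS integerIsStorable, {a: 1} IS :: PROPERTY VALUE AS mapIsStorable

integerIsStorable mapIsStorable

true false

Rows: 1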

Type predicate expressions for numbers of different sizes


For numerical values passed in as parameters, Cypher does not take the size of the number into account.
Cypher will therefore regard any exact numerical parameter as an INTEGER regardless of its declared size.
For example, an INT16 or an INT32 passed through from a language library will both be treated by Cypher
as an INTEGER. Note that any exact numerical parameter used must fit within the range of an INT64.

RETURN $int16param IS :: INTEGER AS isInteger

isInteger

true

Rows: 1

More information about parameters can be found here.

Syntactical variations of type predicate expressions
Type predicate expressions allow for some alternative syntax:

<expr> IS TYPED <TYPE>

<expr> :: <TYPE>

For verifying that an expression is not of a certain type, the following alternative syntax is supported:

<expr> IS NOT TYPED <TYPE>
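
For example, the following forms express the same checks using the alternative syntax:

RETURN
  42 IS TYPED INTEGER AS isTyped,
  42 :: INTEGER AS shorthand,
  'abc' IS NOT TYPED INTEGER AS isNotTyped

isTyped shorthand isNotTyped

true true true

Rows: 1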

Use of ANY and NOTHING types (new in Neo4j 5.10)


ANY is a supertype which matches values of all types. NOTHING is a type containing an empty set of values.
This means that it returns false for all values.

RETURN 42 IS :: ANY AS isOfTypeAny, 42 IS :: NOTHING AS isOfTypeNothing

isOfTypeAny isOfTypeNothing

true false

Rows: 1

Closed Dynamic Unions (new in Neo4j 5.11)


Closed dynamic union types allow for the testing of multiple types in the same predicate.

UNWIND [42, 42.0, "42"] as val


RETURN val, val IS :: INTEGER | FLOAT AS isNumber

val isNumber

42 true

42.0 true

"42" false

Rows: 3

List Types (new in Neo4j 5.10)


Type predicate expressions can be used for LIST types, where the inner type of the elements in the list
must be specified. If the inner type is not relevant, then the ANY type may be used.

For a LIST type check to return true, all values in the list must match the inner type.

UNWIND [[42], [42, null], [42, 42.0]] as val
RETURN val, val IS :: LIST<INTEGER> AS isIntList

val isIntList

[42] true

[42, null] true

[42, 42.0] false

Rows: 3

An empty list will match on all inner types, even the NOTHING type.

RETURN
[] IS :: LIST<NOTHING> AS isNothingList,
[] IS :: LIST<INTEGER> AS isIntList,
[] IS :: LIST<FLOAT NOT NULL> AS isFloatNotNullList

isNothingList isIntList isFloatNotNullList

true true true

Rows: 1

Lists can be combined with closed dynamic union types to create tests for heterogeneous lists.

WITH [1, 0, true, false] AS booleanList


RETURN booleanList IS :: LIST<BOOLEAN | INTEGER> as isMixedList

isMixedList

true

Rows: 1

[1] The minimum value represents the minimum positive value of a FLOAT, i.e. the closest value to zero. It is also possible to
have a negative float.

[2] This is in accordance with the Gregorian calendar; i.e. years AD/CE start at year 1, and the year before that (year 1
BC/BCE) is 0, while year 2 BCE is -1 etc.

[3] The first week of any year is the week that contains the first Thursday of the year, and thus always contains January 4.

[4] For dates from December 29, this could be the next year, and for dates until January 3 this could be the previous year,
depending on how week 1 begins.

[5] Cypher does not support leap seconds; UTC-SLS (UTC with Smoothed Leap Seconds) is used to manage the difference
in time between UTC and TAI (International Atomic Time).

[6] The expression datetime().epochMillis returns the equivalent value of the timestamp() function.

[7] For the nanosecond part of the epoch offset, the regular nanosecond component (instant.nanosecond) can be used.

Functions
This section contains a summary of all functions in Cypher.

To list all functions, run the following query:

List all functions

SHOW FUNCTIONS

For more information about this command, see SHOW FUNCTIONS.

Functions taking a STRING as input all operate on Unicode characters rather than on a

standard char[]. For example, the size() function applied to any Unicode character will
return 1, even if the character does not fit in the 16 bits of one char.
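
A small illustration of this note (the whale emoji is a character outside the 16-bit range):

RETURN size('🐋') AS emojiSize, size('abc') AS stringSize

emojiSize stringSize

1 3

Rows: 1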

Aggregating functions
These functions take multiple values as arguments, and calculate and return an aggregated value from
them.

avg(input :: INTEGER | FLOAT | DURATION) :: INTEGER | FLOAT | DURATION
  Returns the average of a set of INTEGER, FLOAT, or DURATION values.
collect(input :: ANY) :: LIST<ANY>
  Returns a list containing the values returned by an expression.
count(input :: ANY) :: INTEGER
  Returns the number of values or rows.
max(input :: ANY) :: ANY
  Returns the maximum value in a set of values.
min(input :: ANY) :: ANY
  Returns the minimum value in a set of values.
percentileCont(input :: FLOAT, percentile :: FLOAT) :: FLOAT
  Returns the percentile of a value over a group using linear interpolation.
percentileDisc(input :: INTEGER | FLOAT, percentile :: FLOAT) :: FLOAT
  Returns the nearest INTEGER or FLOAT value to the given percentile over a group using a rounding method.
stDev(input :: FLOAT) :: FLOAT
  Returns the standard deviation for the given value over a group for a sample of a population.
stDevP(input :: FLOAT) :: FLOAT
  Returns the standard deviation for the given value over a group for an entire population.
sum(input :: INTEGER | FLOAT | DURATION) :: INTEGER | FLOAT | DURATION
  Returns the sum of a set of INTEGER, FLOAT, or DURATION values.
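
For example, several aggregating functions can be combined in a single RETURN (this sketch assumes Person nodes with name and age properties):

MATCH (p:Person)
RETURN count(p) AS people, collect(p.name) AS names, avg(p.age) AS averageAge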

Database functions (new in Neo4j 5.12)
Database functions provide information about databases.

db.nameFromElementId(elementId :: STRING) :: STRING
  Resolves the database name from the given element id.

GenAI functions (new in Neo4j 5.12)

genai.vector.encode(resource :: STRING, provider :: STRING, configuration :: MAP = {}) :: LIST<FLOAT>
  Encode a given resource as a vector using the named provider.

Graph functions
Graph functions provide information about the constituent graphs in composite databases.

USE graph.byElementId(elementId :: STRING)
  Resolves the constituent graph to which a given element id belongs.
USE graph.byName(name :: STRING)
  Resolves a constituent graph by name.
graph.names() :: LIST<STRING>
  Returns a list containing the names of all graphs in the current composite database.
graph.propertiesByName(name :: STRING) :: MAP
  Returns a map containing the properties associated with the given graph.
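
For example, a query can be routed to one constituent graph of a composite database by name (the graph name composite.sales is hypothetical):

USE graph.byName('composite.sales')
MATCH (n)
RETURN count(n) AS nodes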

List functions
These functions return lists of other values. Further details and examples of lists may be found in Lists.

keys(input :: NODE | RELATIONSHIP | MAP) :: LIST<STRING>
  Returns a LIST<STRING> containing the STRING representations for all the property names of a MAP, NODE, or RELATIONSHIP.
labels(input :: NODE) :: LIST<STRING>
  Returns a LIST<STRING> containing the STRING representations for all the labels of a NODE.
nodes(input :: PATH) :: LIST<NODE>
  Returns a LIST<NODE> containing all the NODE values in a PATH.
range(start :: INTEGER, end :: INTEGER [, step :: INTEGER]) :: LIST<INTEGER>
  Returns a LIST<INTEGER> comprising all INTEGER values within a specified range, optionally specifying a step length.
reduce(accumulator :: VARIABLE = initial :: ANY, variable :: VARIABLE IN list :: LIST<ANY> | expression :: ANY) :: ANY
  Runs an expression against individual elements of a LIST<ANY>, storing the result of the expression in an accumulator.
relationships(input :: PATH) :: LIST<RELATIONSHIP>
  Returns a LIST<RELATIONSHIP> containing all the RELATIONSHIP values in a PATH.
reverse(input :: LIST<ANY>) :: LIST<ANY>
  Returns a LIST<ANY> in which the order of all elements in the given LIST<ANY> has been reversed.
tail(input :: LIST<ANY>) :: LIST<ANY>
  Returns all but the first element in a LIST<ANY>.
toBooleanList(input :: LIST<ANY>) :: LIST<BOOLEAN>
  Converts a LIST<ANY> of values to a LIST<BOOLEAN>. If any values are not convertible to BOOLEAN they will be null in the LIST<BOOLEAN> returned.
toFloatList(input :: LIST<ANY>) :: LIST<FLOAT>
  Converts a LIST<ANY> to a LIST<FLOAT>. If any values are not convertible to FLOAT they will be null in the LIST<FLOAT> returned.
toIntegerList(input :: LIST<ANY>) :: LIST<INTEGER>
  Converts a LIST<ANY> to a LIST<INTEGER>. If any values are not convertible to INTEGER they will be null in the LIST<INTEGER> returned.
toStringList(input :: LIST<ANY>) :: LIST<STRING>
  Converts a LIST<ANY> to a LIST<STRING>. If any values are not convertible to STRING they will be null in the LIST<STRING> returned.

LOAD CSV functions
LOAD CSV functions can be used to get information about the file that is processed by LOAD CSV.

file() :: STRING
    Returns the absolute path of the file that LOAD CSV is using.
linenumber() :: INTEGER
    Returns the line number that LOAD CSV is currently using.

Logarithmic functions
These functions all operate on numerical expressions only, and will return an error if used on any other values.

e() :: FLOAT
    Returns the base of the natural logarithm, e.
exp(input :: FLOAT) :: FLOAT
    Returns e^n, where e is the base of the natural logarithm, and n is the value of the argument expression.
log(input :: FLOAT) :: FLOAT
    Returns the natural logarithm of a FLOAT.
log10(input :: FLOAT) :: FLOAT
    Returns the common logarithm (base 10) of a FLOAT.
sqrt(input :: FLOAT) :: FLOAT
    Returns the square root of a FLOAT.

Numeric functions
These functions all operate on numerical expressions only, and will return an error if used on any other values.

abs(input :: INTEGER | FLOAT) :: INTEGER | FLOAT
    Returns the absolute value of an INTEGER or FLOAT.
ceil(input :: FLOAT) :: FLOAT
    Returns the smallest FLOAT that is greater than or equal to a number and equal to an INTEGER.
floor(input :: FLOAT) :: FLOAT
    Returns the largest FLOAT that is less than or equal to a number and equal to an INTEGER.
isNaN(input :: INTEGER | FLOAT) :: BOOLEAN
    Returns true if the floating point number is NaN.
rand() :: FLOAT
    Returns a random FLOAT in the range from 0 (inclusive) to 1 (exclusive).
round(input :: FLOAT [, precision :: INTEGER | FLOAT, mode :: STRING]) :: FLOAT
    Returns the value of a number rounded to the nearest INTEGER, optionally using a specified precision and rounding mode.
sign(input :: INTEGER | FLOAT) :: INTEGER
    Returns the signum of an INTEGER or FLOAT: 0 if the number is 0, -1 for any negative number, and 1 for any positive number.

Trigonometric functions
These functions all operate on numerical expressions only, and will return an error if used on any other values.

All trigonometric functions operate on radians, unless otherwise specified.

acos(input :: FLOAT) :: FLOAT
    Returns the arccosine of a FLOAT in radians.
asin(input :: FLOAT) :: FLOAT
    Returns the arcsine of a FLOAT in radians.
atan(input :: FLOAT) :: FLOAT
    Returns the arctangent of a FLOAT in radians.
atan2(y :: FLOAT, x :: FLOAT) :: FLOAT
    Returns the arctangent2 of a set of coordinates in radians.
cos(input :: FLOAT) :: FLOAT
    Returns the cosine of a FLOAT.
cot(input :: FLOAT) :: FLOAT
    Returns the cotangent of a FLOAT.
degrees(input :: FLOAT) :: FLOAT
    Converts radians to degrees.
haversin(input :: FLOAT) :: FLOAT
    Returns half the versine of a number.
pi() :: FLOAT
    Returns the mathematical constant pi.
radians(input :: FLOAT) :: FLOAT
    Converts degrees to radians.
sin(input :: FLOAT) :: FLOAT
    Returns the sine of a FLOAT.
tan(input :: FLOAT) :: FLOAT
    Returns the tangent of a FLOAT.
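
For instance, the following query (an illustrative sketch using arbitrary literal values) combines several of these functions:

RETURN radians(180) AS piInRadians,
       sin(pi() / 2) AS sinOfHalfPi,
       degrees(atan2(1, 1)) AS angleInDegrees

This returns approximately 3.14159, 1.0, and 45.0, respectively.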

Predicate functions
These functions return either true or false for the given arguments.

all(variable :: ANY, list :: LIST<ANY>, predicate :: ANY) :: BOOLEAN
    Returns true if the predicate holds for all elements in the given LIST<ANY>.
any(variable :: ANY, list :: LIST<ANY>, predicate :: ANY) :: BOOLEAN
    Returns true if the predicate holds for at least one element in the given LIST<ANY>.
exists(input :: ANY) :: BOOLEAN
    Returns true if a match for the pattern exists in the graph.
isEmpty(input :: LIST<ANY> | MAP | STRING) :: BOOLEAN
    Checks whether the given LIST<ANY>, MAP, or STRING is empty.
none(variable :: ANY, list :: LIST<ANY>, predicate :: ANY) :: BOOLEAN
    Returns true if the predicate holds for no element in the given LIST<ANY>.
single(variable :: ANY, list :: LIST<ANY>, predicate :: ANY) :: BOOLEAN
    Returns true if the predicate holds for exactly one of the elements in the given LIST<ANY>.
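
As an illustration (the list literal is arbitrary), the following query applies several predicate functions to the same list:

WITH [1, 2, 3, 4] AS numbers
RETURN all(x IN numbers WHERE x > 0) AS allPositive,
       any(x IN numbers WHERE x > 3) AS anyAboveThree,
       none(x IN numbers WHERE x > 10) AS noneAboveTen,
       single(x IN numbers WHERE x = 2) AS exactlyOneTwo,
       isEmpty(numbers) AS listIsEmpty

The first four expressions evaluate to true and the last to false.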

Scalar functions
These functions return a single value.

char_length(input :: STRING) :: INTEGER
    Returns the number of Unicode characters in a STRING. (new)
character_length(input :: STRING) :: INTEGER
    Returns the number of Unicode characters in a STRING. (new)
coalesce(input :: ANY) :: ANY
    Returns the first non-null value in a list of expressions.
elementId(input :: NODE | RELATIONSHIP) :: STRING
    Returns a node or relationship identifier, unique within a specific transaction and DBMS.
endNode(input :: RELATIONSHIP) :: NODE
    Returns the end NODE of a RELATIONSHIP.
head(list :: LIST<ANY>) :: ANY
    Returns the first element in a LIST<ANY>.
id(input :: NODE | RELATIONSHIP) :: INTEGER
    (Deprecated) Returns the id of a NODE or a RELATIONSHIP. Replaced by elementId().
last(list :: LIST<ANY>) :: ANY
    Returns the last element in a LIST<ANY>.
length(input :: PATH) :: INTEGER
    Returns the length of a PATH.
nullIf(v1 :: ANY, v2 :: ANY) :: ANY
    Returns null if the two given parameters are equivalent, otherwise returns the value of the first parameter.
properties(input :: NODE | RELATIONSHIP | MAP) :: MAP
    Returns a MAP containing all the properties of a NODE or RELATIONSHIP.
randomUUID() :: STRING
    Generates a random UUID.
size(input :: STRING | LIST<ANY>) :: INTEGER
    Returns the number of items in a LIST<ANY> or the number of Unicode characters in a STRING.
startNode(input :: RELATIONSHIP) :: NODE
    Returns the start NODE of a RELATIONSHIP.
toBoolean(input :: BOOLEAN | STRING | INTEGER) :: BOOLEAN
    Converts a BOOLEAN, STRING, or INTEGER value to a BOOLEAN value.
toBooleanOrNull(input :: ANY) :: BOOLEAN
    Converts a value to a BOOLEAN value, or null if the value cannot be converted.
toFloat(input :: STRING | INTEGER | FLOAT) :: FLOAT
    Converts a STRING or INTEGER value to a FLOAT value.
toFloatOrNull(input :: ANY) :: FLOAT
    Converts a value to a FLOAT value, or null if the value cannot be converted.
toInteger(input :: BOOLEAN | STRING | INTEGER | FLOAT) :: INTEGER
    Converts a BOOLEAN, STRING, or FLOAT value to an INTEGER value.
toIntegerOrNull(input :: ANY) :: INTEGER
    Converts a value to an INTEGER value, or null if the value cannot be converted.
type(input :: RELATIONSHIP) :: STRING
    Returns a STRING representation of the RELATIONSHIP type.
valueType(input :: ANY) :: STRING
    Returns a STRING representation of the most precise value type that the given expression evaluates to.
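
For example, the following query (an illustrative sketch using only literal values) combines a few of the scalar functions listed above:

RETURN coalesce(null, 'default') AS firstNonNull,
       size('Cypher') AS characterCount,
       head([1, 2, 3]) AS firstElement,
       nullIf('a', 'a') AS nulled,
       valueType(1.0) AS typeOfFloat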

String functions
These functions are used to manipulate STRING values or to create a STRING representation of another value.

btrim(original :: STRING [, trimCharacterString :: STRING]) :: STRING
    Returns the given STRING with leading and trailing whitespace removed, optionally specifying a trimCharacterString value to remove. (new)
left(original :: STRING, length :: INTEGER) :: STRING
    Returns a STRING containing the specified number (INTEGER) of leftmost characters in the given STRING.
lower(input :: STRING) :: STRING
    Returns the given STRING in lowercase. This function is an alias to the toLower() function, and it was introduced as part of Cypher's GQL conformance. (new)
ltrim(input :: STRING [, trimCharacterString :: STRING]) :: STRING
    Returns the given STRING with leading whitespace removed, optionally specifying a trimCharacterString to remove.
normalize(input :: STRING [, normalForm = NFC :: [NFC, NFD, NFKC, NFKD]]) :: STRING
    Normalizes a STRING, optionally specifying a normalization form. (new)
replace(original :: STRING, search :: STRING, replace :: STRING) :: STRING
    Returns a STRING in which all occurrences of a specified search STRING in the given STRING have been replaced by another (specified) replacement STRING.
reverse(input :: STRING) :: STRING
    Returns a STRING in which the order of all characters in the given STRING has been reversed.
right(original :: STRING, length :: INTEGER) :: STRING
    Returns a STRING containing the specified number of rightmost characters in the given STRING.
rtrim(input :: STRING [, trimCharacterString :: STRING]) :: STRING
    Returns the given STRING with trailing whitespace removed, optionally specifying a trimCharacterString of characters to remove.
split(original :: STRING, splitDelimiters :: LIST<STRING>) :: LIST<STRING>
    Returns a LIST<STRING> resulting from the splitting of the given STRING around matches of any of the given delimiters.
substring(original :: STRING, start :: INTEGER [, length :: INTEGER]) :: STRING
    Returns a substring of a given length from the given STRING, beginning with a 0-based index start.
toLower(input :: STRING) :: STRING
    Returns the given STRING in lowercase.
toString(input :: ANY) :: STRING
    Converts an INTEGER, FLOAT, BOOLEAN, POINT or temporal type (i.e. DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME or DURATION) value to a STRING.
toStringOrNull(input :: ANY) :: STRING
    Converts an INTEGER, FLOAT, BOOLEAN, POINT or temporal type (i.e. DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME or DURATION) value to a STRING, or null if the value cannot be converted.
toUpper(input :: STRING) :: STRING
    Returns the given STRING in uppercase.
trim(trimCharacterString :: STRING, trimSpecification :: STRING, input :: STRING) :: STRING
    Returns the given STRING with the leading and/or trailing trimCharacterString character removed.
upper(input :: STRING) :: STRING
    Returns the given STRING in uppercase. This function is an alias to the toUpper() function, and it was introduced as part of Cypher's GQL conformance. (new)
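
For instance, the following query (an illustrative sketch using arbitrary literal strings) exercises several of these functions:

RETURN toUpper('hello') AS upperCased,
       split('2024-05-01', '-') AS parts,
       replace('colour', 'our', 'or') AS replaced,
       substring('Neo4j', 0, 3) AS prefix,
       btrim('   padded   ') AS trimmed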

Spatial functions
These functions are used to specify 2D or 3D points in a geographic or cartesian Coordinate Reference System and to calculate the geodesic distance between two points.

point(input :: MAP) :: POINT
    Returns a 2D or 3D point object, given two or respectively three coordinate values in the Cartesian coordinate system or WGS 84 geographic coordinate system.
point.distance(from :: POINT, to :: POINT) :: FLOAT
    Returns a FLOAT representing the distance between any two points in the same CRS. If the points are in the WGS 84 CRS, the function returns the geodesic distance (i.e., the shortest path along the curved surface of the Earth). If the points are in a Cartesian CRS, the function returns the Euclidean distance (i.e., the shortest straight-line distance in a flat, planar space).
point.withinBBox(point :: POINT, lowerLeft :: POINT, upperRight :: POINT) :: BOOLEAN
    Returns true if the provided point is within the bounding box defined by the two provided points, lowerLeft and upperRight.
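
For example, the following query (a sketch with arbitrary WGS 84 coordinates) computes a geodesic distance and a bounding-box check:

WITH point({longitude: 12.78, latitude: 56.70}) AS p1,
     point({longitude: 12.79, latitude: 56.71}) AS p2
RETURN point.distance(p1, p2) AS distanceInMeters,
       point.withinBBox(p1,
                        point({longitude: 12.0, latitude: 56.0}),
                        point({longitude: 13.0, latitude: 57.0})) AS insideBox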

Temporal duration functions
DURATION values of the temporal types can be created and manipulated using the following functions:

duration(input :: ANY) :: DURATION
    Constructs a DURATION value.
duration.between(from :: ANY, to :: ANY) :: DURATION
    Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in logical units.
duration.inDays(from :: ANY, to :: ANY) :: DURATION
    Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in days.
duration.inMonths(from :: ANY, to :: ANY) :: DURATION
    Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in months.
duration.inSeconds(from :: ANY, to :: ANY) :: DURATION
    Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in seconds.
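
For example (the dates below are chosen arbitrarily):

RETURN duration({days: 1, hours: 12}) AS constructed,
       duration.between(date('2024-01-01'), date('2024-03-01')) AS between,
       duration.inDays(date('2024-01-01'), date('2024-03-01')) AS inDays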

Temporal instant types functions
Values of the temporal types (DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, and LOCAL DATETIME) can be created and manipulated using the following functions:

date(input = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: DATE
    Creates a DATE instant.
date.realtime(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: DATE
    Returns the current DATE instant using the realtime clock.
date.statement(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: DATE
    Returns the current DATE instant using the statement clock.
date.transaction(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: DATE
    Returns the current DATE instant using the transaction clock.
date.truncate(unit :: STRING, input = DEFAULT_TEMPORAL_ARGUMENT :: ANY, fields = null :: MAP) :: DATE
    Truncates the given temporal value to a DATE instant using the specified unit.
datetime(input = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED DATETIME
    Creates a ZONED DATETIME instant.
datetime.fromepoch(seconds :: INTEGER | FLOAT, nanoseconds :: INTEGER | FLOAT) :: ZONED DATETIME
    Creates a ZONED DATETIME given the seconds and nanoseconds since the start of the epoch.
datetime.fromepochmillis(milliseconds :: INTEGER | FLOAT) :: ZONED DATETIME
    Creates a ZONED DATETIME given the milliseconds since the start of the epoch.
datetime.realtime(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED DATETIME
    Returns the current ZONED DATETIME instant using the realtime clock.
datetime.statement(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED DATETIME
    Returns the current ZONED DATETIME instant using the statement clock.
datetime.transaction(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED DATETIME
    Returns the current ZONED DATETIME instant using the transaction clock.
datetime.truncate(unit :: STRING, input = DEFAULT_TEMPORAL_ARGUMENT :: ANY, fields = null :: MAP) :: ZONED DATETIME
    Truncates the given temporal value to a ZONED DATETIME instant using the specified unit.
localdatetime(input = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL DATETIME
    Creates a LOCAL DATETIME instant.
localdatetime.realtime(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL DATETIME
    Returns the current LOCAL DATETIME instant using the realtime clock.
localdatetime.statement(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL DATETIME
    Returns the current LOCAL DATETIME instant using the statement clock.
localdatetime.transaction(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL DATETIME
    Returns the current LOCAL DATETIME instant using the transaction clock.
localdatetime.truncate(unit :: STRING, input = DEFAULT_TEMPORAL_ARGUMENT :: ANY, fields = null :: MAP) :: LOCAL DATETIME
    Truncates the given temporal value to a LOCAL DATETIME instant using the specified unit.
localtime(input = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL TIME
    Creates a LOCAL TIME instant.
localtime.realtime(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL TIME
    Returns the current LOCAL TIME instant using the realtime clock.
localtime.statement(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL TIME
    Returns the current LOCAL TIME instant using the statement clock.
localtime.transaction(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: LOCAL TIME
    Returns the current LOCAL TIME instant using the transaction clock.
localtime.truncate(unit :: STRING, input = DEFAULT_TEMPORAL_ARGUMENT :: ANY, fields = null :: MAP) :: LOCAL TIME
    Truncates the given temporal value to a LOCAL TIME instant using the specified unit.
time(input = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED TIME
    Creates a ZONED TIME instant.
time.realtime(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED TIME
    Returns the current ZONED TIME instant using the realtime clock.
time.statement(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED TIME
    Returns the current ZONED TIME instant using the statement clock.
time.transaction(timezone = DEFAULT_TEMPORAL_ARGUMENT :: ANY) :: ZONED TIME
    Returns the current ZONED TIME instant using the transaction clock.
time.truncate(unit :: STRING, input = DEFAULT_TEMPORAL_ARGUMENT :: ANY, fields = null :: MAP) :: ZONED TIME
    Truncates the given temporal value to a ZONED TIME instant using the specified unit.
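
For example, the following query (an illustrative sketch; the literal values are arbitrary) creates and truncates a few temporal instants:

RETURN date('2024-05-01') AS specificDate,
       datetime({year: 2024, month: 5, day: 1, hour: 12, timezone: 'Europe/Stockholm'}) AS zonedDatetime,
       datetime.truncate('month', datetime('2024-05-17T12:30:00Z')) AS truncatedToMonth,
       localtime('12:30:00') AS timeOnly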

User-defined functions
User-defined functions are written in Java, deployed into the database, and are called in the same way as any other Cypher function. There are two main types of functions that can be developed and used:

Scalar: for each row the function takes parameters and returns a result (usage: Using UDF; developing: Extending Neo4j (UDF)).
Aggregating: consumes many rows and produces an aggregated result (usage: Using aggregating UDF; developing: Extending Neo4j (Aggregating UDF)).

Vector functions (introduced in Neo4j 5.18)
Vector functions allow you to compute the similarity scores of vector pairs.

vector.similarity.cosine(a :: LIST<INTEGER | FLOAT>, b :: LIST<INTEGER | FLOAT>) :: FLOAT
    Returns a FLOAT representing the similarity between the argument vectors based on their cosine.
vector.similarity.euclidean(a :: LIST<INTEGER | FLOAT>, b :: LIST<INTEGER | FLOAT>) :: FLOAT
    Returns a FLOAT representing the similarity between the argument vectors based on their Euclidean distance.
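
For instance, using literal vectors (both functions return a score where 1.0 indicates identical vectors and lower values indicate less similar vectors):

RETURN vector.similarity.cosine([1.0, 2.0], [1.0, 2.0]) AS identicalScore,
       vector.similarity.cosine([1.0, 0.0], [0.0, 1.0]) AS orthogonalScore,
       vector.similarity.euclidean([0.0, 0.0], [3.0, 4.0]) AS euclideanScore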


Aggregating functions
An aggregating function performs a calculation over a set of values, returning a single value. Aggregation
can be computed over all the matching paths, or it can be further divided by introducing grouping keys.

To learn more about how Cypher handles aggregations performed on zero rows, refer to
 Neo4j Knowledge Base → Understanding aggregations on zero rows.

Example graph
The following graph is used for the examples below:

[Example graph: Person nodes for Keanu Reeves (age 58), Carrie Anne Moss (age 55), Liam Neeson (age 70), Guy Pearce (age 55), and Kathryn Bigelow (age 71), connected by KNOWS relationships, plus a Movie node titled 'Speed' that Keanu Reeves ACTED_IN.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(keanu:Person {name: 'Keanu Reeves', age: 58}),
(liam:Person {name: 'Liam Neeson', age: 70}),
(carrie:Person {name: 'Carrie Anne Moss', age: 55}),
(guy:Person {name: 'Guy Pearce', age: 55}),
(kathryn:Person {name: 'Kathryn Bigelow', age: 71}),
(speed:Movie {title: 'Speed'}),
(keanu)-[:ACTED_IN]->(speed),
(keanu)-[:KNOWS]->(carrie),
(keanu)-[:KNOWS]->(liam),
(keanu)-[:KNOWS]->(kathryn),
(carrie)-[:KNOWS]->(guy),
(liam)-[:KNOWS]->(guy)

avg()
Details

Syntax avg(input)

Description Returns the average of a set of INTEGER, FLOAT or DURATION values.

Arguments Name Type Description

input INTEGER | FLOAT | DURATION A value aggregated to form


an average.

Returns INTEGER | FLOAT | DURATION

Considerations

Any null values are excluded from the calculation.

avg(null) returns null.

Example 85. avg() - numerical values

Query

MATCH (p:Person)
RETURN avg(p.age)

The average of all the values in the property age is returned:

Result

avg(p.age)

61.8

Rows: 1

Example 86. avg() - duration values

Query

UNWIND [duration('P2DT3H'), duration('PT1H45S')] AS dur


RETURN avg(dur)

The average of the two supplied DURATION values is returned:

Result

avg(dur)

P1DT2H22.5S

Rows: 1

collect()
Details

Syntax collect(input)

Description Returns a list containing the values returned by an expression.

Arguments Name Type Description

input ANY A value aggregated into a list.

Returns LIST<ANY>

Considerations

Any null values are ignored and will not be added to the list.

collect(null) returns an empty list.

Example 87. collect()

Query

MATCH (p:Person)
RETURN collect(p.age)

All the values are collected and returned in a single list:

Result

collect(p.age)

[58, 70, 55, 55, 71]

Rows: 1

count()
Details

Syntax count(input)

Description Returns the number of values or rows.

Arguments Name Type Description

input ANY A value to be aggregated.

Returns INTEGER

Considerations

count(*) includes rows returning null.

count(input) ignores null values.

count(null) returns 0.

Neo4j maintains a transactional count store for holding count metadata, which can
significantly increase the speed of queries using the count() function. For more
 information about the count store, refer to Neo4j Knowledge Base → Fast counts using
the count store.

Using count(*) to return the number of nodes


The function count(*) can be used to return the number of nodes; for example, the number of nodes
connected to a node n.

Example 88. count()

Query

MATCH (p:Person {name: 'Keanu Reeves'})-->(x)


RETURN labels(p), p.age, count(*)

The labels and age property of the start node Keanu Reeves and the number of nodes related to it are
returned:

Result

labels(p) p.age count(*)

["Person"] 58 4

Rows: 1

Using count(*) to group and count relationship types


The function count(*) can be used to group the type of matched relationships and return the number of
types.

Example 89. count()

Query

MATCH (p:Person {name: 'Keanu Reeves'})-[r]->()


RETURN type(r), count(*)

The type of matched relationships are grouped and the group count of relationship types is returned:

Result

type(r) count(*)

"ACTED_IN" 1

"KNOWS" 3

Rows: 2

Counting non-null values


Instead of simply returning the number of rows with count(*), the function count(expression) can be
used to return the number of non-null values returned by the expression.

Example 90. count()

Query

MATCH (p:Person)
RETURN count(p.age)

The number of nodes with the label Person and a property age is returned (to calculate the sum of these ages, use sum(p.age)):

Result

count(p.age)

5

Rows: 1

Counting with and without duplicates


The default behavior of the count function is to count all matching results, including duplicates. To avoid
counting duplicates, use the DISTINCT keyword.

As of Neo4j 5.15, it is also possible to use the ALL keyword with aggregating functions. This will count all
results, including duplicates, and is functionally the same as not using the DISTINCT keyword. The ALL
keyword was introduced as part of Cypher’s GQL conformance.

This example tries to find all friends of friends of Keanu Reeves and count them. It shows the behavior of
using both the ALL and the DISTINCT keywords:

Example 91. count()

Query

MATCH (p:Person)-->(friend:Person)-->(friendOfFriend:Person)
WHERE p.name = 'Keanu Reeves'
RETURN friendOfFriend.name, count(friendOfFriend), count(ALL friendOfFriend), count(DISTINCT
friendOfFriend)

The nodes Carrie Anne Moss and Liam Neeson both have an outgoing KNOWS relationship to Guy
Pearce. The Guy Pearce node will, therefore, get counted twice when not using DISTINCT.

Result

friendOfFriend.name count(friendOfFriend) count(ALL friendOfFriend) count(DISTINCT


friendOfFriend)

"Guy Pearce" 2 2 1

max()
Details

Syntax max(input)

Description Returns the maximum value in a set of values.

Arguments Name Type Description

input ANY A value to be aggregated.

Returns ANY

Considerations

Any null values are excluded from the calculation.

In a mixed set, any numeric value is always considered to be higher than any STRING value, and any STRING value is always
considered to be higher than any LIST<ANY>.

Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to
the end.

max(null) returns null.

Example 92. max()

Query

UNWIND [1, 'a', null, 0.2, 'b', '1', '99'] AS val


RETURN max(val)

The highest of all the values in the mixed set — in this case, the numeric value 1 — is returned:

Result

max(val)

1

Rows: 1

The value '99' (a STRING), is considered to be a lower value than 1 (an INTEGER),
 because '99' is a STRING.

Example 93. max()

Query

UNWIND [[1, 'a', 89], [1, 2]] AS val


RETURN max(val)

The highest of all the lists in the set — in this case, the list [1, 2] — is returned, as the number 2 is
considered to be a higher value than the STRING 'a', even though the list [1, 'a', 89] contains more
elements.

Result

max(val)

[1,2]

Rows: 1

Example 94. max()

Query

MATCH (p:Person)
RETURN max(p.age)

The highest of all the values in the property age is returned:

Result

max(p.age)

71

Rows: 1

min()
Details

Syntax min(input)

Description Returns the minimum value in a set of values.

Arguments Name Type Description

input ANY A value to be aggregated.

Returns ANY

Considerations

Any null values are excluded from the calculation.

In a mixed set, any STRING value is always considered to be lower than any numeric value, and any LIST<ANY> is always
considered to be lower than any STRING.

Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to
the end.

min(null) returns null.

Example 95. min()

Query

UNWIND [1, 'a', null, 0.2, 'b', '1', '99'] AS val


RETURN min(val)

The lowest of all the values in the mixed set — in this case, the STRING value "1" — is returned. Note
that the (numeric) value 0.2, which may appear at first glance to be the lowest value in the list, is
considered to be a higher value than "1" as the latter is a STRING.

Result

min(val)

"1"

Rows: 1

Example 96. min()

Query

UNWIND ['d', [1, 2], ['a', 'c', 23]] AS val


RETURN min(val)

The lowest of all the values in the set — in this case, the list ['a', 'c', 23] — is returned, as (i) the
two lists are considered to be lower values than the STRING "d", and (ii) the STRING "a" is considered
to be a lower value than the numerical value 1.

Result

min(val)

["a","c",23]

Rows: 1

Example 97. min()

Query

MATCH (p:Person)
RETURN min(p.age)

The lowest of all the values in the property age is returned:

Result

min(p.age)

55

Rows: 1

percentileCont()
Details

Syntax percentileCont(input, percentile)

Description Returns the percentile of a value over a group using linear interpolation.

Arguments Name Type Description

input FLOAT A value to be aggregated.

percentile FLOAT A percentile between 0.0 and


1.0.

Returns FLOAT

Considerations

Any null values are excluded from the calculation.

percentileCont(null, percentile) returns null.

Example 98. percentileCont()

Query

MATCH (p:Person)
RETURN percentileCont(p.age, 0.4)

The 40th percentile of the values in the property age is returned, calculated with a weighted average:

Result

percentileCont(p.age, 0.4)

56.8

Rows: 1

percentileDisc()
Details

Syntax percentileDisc(input, percentile)

Description Returns the nearest INTEGER or FLOAT value to the given percentile over a group using a
rounding method.

Arguments Name Type Description

input INTEGER | FLOAT A value to be aggregated.

percentile FLOAT A percentile between 0.0 and


1.0.

Returns INTEGER | FLOAT

Considerations

Any null values are excluded from the calculation.

percentileDisc(null, percentile) returns null.

Example 99. percentileDisc()

Query

MATCH (p:Person)
RETURN percentileDisc(p.age, 0.5)

The 50th percentile of the values in the property age is returned:

Result

percentileDisc(p.age, 0.5)

58

Rows: 1

stDev()
Details

Syntax stDev(input)

Description Returns the standard deviation for the given value over a group for a sample of a population.

Arguments Name Type Description

input FLOAT The value to calculate the


standard deviation of.

Returns FLOAT

Considerations

Any null values are excluded from the calculation.

stDev(null) returns 0.

Example 100. stDev()

Query

MATCH (p:Person)
WHERE p.name IN ['Keanu Reeves', 'Liam Neeson', 'Carrie Anne Moss']
RETURN stDev(p.age)

The standard deviation of the values in the property age is returned:

Result

stDev(p.age)

7.937253933193772

Rows: 1

stDevP()
Details

Syntax stDevP(input)

Description Returns the standard deviation for the given value over a group for an entire population.

Arguments Name Type Description

input FLOAT The value to calculate the


population standard deviation
of.

Returns FLOAT

Considerations

Any null values are excluded from the calculation.

stDevP(null) returns 0.

Example 101. stDevP()

Query

MATCH (p:Person)
WHERE p.name IN ['Keanu Reeves', 'Liam Neeson', 'Carrie Anne Moss']
RETURN stDevP(p.age)

The population standard deviation of the values in the property age is returned:

Result

stDevP(p.age)

6.48074069840786

Rows: 1

sum()
Details

Syntax sum(input)

Description Returns the sum of a set of INTEGER, FLOAT or DURATION values

Arguments Name Type Description

input INTEGER | FLOAT | DURATION A value to be aggregated.

Returns INTEGER | FLOAT | DURATION

Considerations

Any null values are excluded from the calculation.

sum(null) returns 0.

Example 102. sum() - numeric values

Query

MATCH (p:Person)
RETURN sum(p.age)

The sum of all the values in the property age is returned:

Result

sum(p.age)

309

Rows: 1

Example 103. sum() - duration values

Query

UNWIND [duration('P2DT3H'), duration('PT1H45S')] AS dur


RETURN sum(dur)

The sum of the two supplied durations is returned:

Result

sum(dur)

P2DT4H45S

Rows: 1

Aggregating expressions and grouping keys


Aggregating expressions are expressions which contain one or more aggregating functions. A simple
aggregating expression consists of a single aggregating function. For instance, sum(x.a) is an aggregating
expression that only consists of the aggregating function sum() with x.a as its argument. Aggregating
expressions are also allowed to be more complex, where the result of one or more aggregating functions
are input arguments to other expressions. For instance, 0.1 * (sum(x.a) / count(x.b)) is an aggregating
expression that contains two aggregating functions, sum() with x.a as its argument and count() with
x.b as its argument. Both are input arguments to the division expression.

Grouping keys are non-aggregating expressions that are used to group the values going into the
aggregating functions. For example, given the following query containing two return expressions, n and
count(*):

RETURN n, count(*)

The first, n is not an aggregating function, so it will be the grouping key. The latter, count(*) is an
aggregating function. The matching paths will be divided into different buckets, depending on the
grouping key. The aggregating function will then be run on these buckets, calculating an aggregate value
per bucket.

The input expression of an aggregating function can contain any expression, including expressions that are
not grouping keys. However, not all expressions can be composed with aggregating functions. The
example below will throw an error since n.x, which is not a grouping key, is combined with the
aggregating function count(*).

RETURN n.x + count(*)

To sort the result set using aggregating functions, the aggregation must be included in the ORDER BY sub-
clause following the RETURN clause.
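
For example, against the example graph above, the following query groups by p.name and orders the rows by the aggregated count:

MATCH (p:Person)-[:KNOWS]-(other:Person)
RETURN p.name, count(other) AS connections
ORDER BY count(other) DESC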

Examples
Example 104. Simple aggregation without any grouping keys

Query

MATCH (p:Person)
RETURN max(p.age)

Result

max(p.age)

71

Rows: 1

Example 105. Addition of an aggregation and a constant, without any grouping keys

Query

MATCH (p:Person)
RETURN max(p.age) + 1

Result

max(p.age) + 1

72

Rows: 1

Example 106. Subtraction of a property access and an aggregation

Note that p is a grouping key:

Query

MATCH (p:Person{name:'Keanu Reeves'})-[:KNOWS]-(f:Person)


RETURN p, p.age - max(f.age)

Result

p p.age - max(f.age)

{{"name":"Keanu Reeves","age":58}} -13

Rows: 1

Example 107. Subtraction of a property access and an aggregation.

Note that p.age is a grouping key:

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)


RETURN p.age, p.age - max(f.age)

Result

p.age p.age - max(f.age)

58 -13

Rows: 1

Grouping keys themselves can be complex expressions. For better query readability, Cypher only
recognizes a sub-expression in aggregating expressions as a grouping key if the grouping key is
either:

• A variable - e.g. the p in RETURN p, p.age - max(f.age).

• A property access - e.g. the p.age in RETURN p.age, p.age - max(f.age).

• A map access - e.g. the p.age in WITH {name:'Keanu Reeves', age:58} AS p RETURN p.age,
p.age - max(p.age).

If more complex grouping keys are needed as operands in aggregating expression, it is always
possible to project them in advance using WITH.

Using the property p.age will throw an exception, since p.age is not a grouping key. Therefore, it
cannot be used in the expressions which contain the aggregating function. The below two queries
would consequently return the same error message:

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)


RETURN p.age - max(f.age)

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)


RETURN p.age + p.age, p.age + p.age - max(f.age)

Error message

Aggregation column contains implicit grouping expressions. For example, in 'RETURN n.a, n.a + n.b +
count(*)' the aggregation expression 'n.a + n.b + count(*)' includes the implicit grouping key 'n.b'.
It may be possible to rewrite the query by extracting these grouping/aggregation expressions into a
preceding WITH clause. Illegal expression(s): n.age

However, the latter query would work if rewritten to:

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)


WITH p.age + p.age AS groupingKey, f
RETURN groupingKey, groupingKey - max(f.age)

Result

groupingKey groupingKey - max(f.age)

116 45

Rows: 1

Rules for aggregating expressions


For aggregating expressions to be correctly computable for the buckets formed by the grouping key(s),
they have to fulfill some requirements. Specifically, each sub-expression in an aggregating expression has
to be either:

• an aggregating function, e.g. sum(x.a).

• a constant, e.g. 0.1.

• a parameter, e.g. $param.

• a grouping key, e.g. the a in RETURN a, count(*).

• a local variable, e.g. the x in count(*) + size([ x IN range(1, 10) | x ]).

• a sub-expression, all operands of which have to be allowed in an aggregating expression.

Database functions

db.nameFromElementId()
Details

Syntax db.nameFromElementId(elementId)

Description Resolves the database name for the given element id.

Arguments Name Type Description

elementId STRING An element id of a node or


relationship.

Returns STRING

Considerations

The name of the database can only be returned if the provided element id belongs to a standard database in the DBMS.

Example 108. db.nameFromElementId()

Query

WITH "2:efc7577d-022a-107c-a736-dbcdfc189c03:0" AS eid


RETURN db.nameFromElementId(eid) AS name

Returns the name of the database which the element id belongs to.

Result

name

"neo4j"

Rows: 1

Graph functions

graph.names()
Details

Syntax graph.names()

Description Lists the names of graphs in the current database.

Returns LIST<STRING>

Considerations

graph.names() is only supported on composite databases.

Example 109. graph.names()

Setup

CREATE DATABASE dba;


CREATE DATABASE dbb;
CREATE DATABASE dbc;
CREATE COMPOSITE DATABASE composite;
CREATE ALIAS composite.first FOR DATABASE dba;
CREATE ALIAS composite.second FOR DATABASE dbb;
CREATE ALIAS composite.third FOR DATABASE dbc;

Query

RETURN graph.names() AS name

The names of all graphs on the current composite database are returned.

Result

name

"composite.first"

"composite.second"

"composite.third"

Rows: 3

graph.propertiesByName()
Details

Syntax graph.propertiesByName(graphName)

Description Returns the MAP of properties associated with a graph.

Arguments Name Type Description

graphName STRING The name of the graph from


which all associated
properties will be returned.

Returns MAP

Considerations

graph.propertiesByName() is only supported on composite databases.

The properties in the returned MAP are set on the alias that adds the graph as a constituent of a composite database.

Example 110. graph.propertiesByName()

Setup

CREATE DATABASE dba;


CREATE DATABASE dbb;
CREATE DATABASE dbc;
CREATE COMPOSITE DATABASE composite;
CREATE ALIAS composite.first FOR DATABASE dba
PROPERTIES {number: 1, tags: ['A', 'B']};
CREATE ALIAS composite.second FOR DATABASE dbb
PROPERTIES {number: 0, tags: ['A']};
CREATE ALIAS composite.third FOR DATABASE dbc
PROPERTIES {number: 2, tags: ['B', 'C']};

Query

UNWIND graph.names() AS name


RETURN name, graph.propertiesByName(name) AS props

Properties for all graphs on the current composite database are returned.

Result

name props

"composite.first" {number: 1, tags: ["A", "B"]}

"composite.second" {number: 0, tags: ["A"]}

"composite.third" {number: 2, tags: ["B", "C"]}

Rows: 3

Query

UNWIND graph.names() AS name


WITH name, graph.propertiesByName(name) AS props
WHERE "A" IN props.tags
CALL () {
USE graph.byName(name)
MATCH (n)
RETURN n
}
RETURN n

Returns all nodes from a subset of graphs that have a tags property containing "A".

The above query uses an empty variable scope clause: CALL () { … } (introduced

 in Neo4j 5.23). If you are using an older version of Neo4j, use CALL { … } instead.
For more information, see CALL subqueries → Importing variables.

graph.byName()
Details

Syntax graph.byName(name)

Description Returns the graph reference of the given name. It is only supported in the USE clause, on
composite databases.

Arguments Name Type Description

name STRING The name of the graph to be


resolved.

Returns GRAPH

Example 111. graph.byName()

Query

UNWIND graph.names() AS graphName


CALL () {
USE graph.byName(graphName)
MATCH (n)
RETURN n
}
RETURN n

Returns all nodes from all graphs on the current composite database.

The above query uses an empty variable scope clause: CALL () { … } (introduced

 in Neo4j 5.23). If you are using an older version of Neo4j, use CALL { … } instead.
For more information, see CALL subqueries → Importing variables.

graph.byElementId() (introduced in Neo4j 5.13)


Details

Syntax graph.byElementId(elementId)

Description Returns the graph reference with the given element id. It is only supported in the USE clause,
on composite databases.

Arguments Name Type Description

elementId STRING An element id of a node or


relationship.

Returns GRAPH

Considerations

If the constituent database is not a standard database in the DBMS, an error will be thrown.

Example 112. graph.byElementId()

In this example, it is assumed that the DBMS contains a composite database constituent, which
contains the element id 4:c0a65d96-4993-4b0c-b036-e7ebd9174905:0.

Query

USE graph.byElementId("4:c0a65d96-4993-4b0c-b036-e7ebd9174905:0")
MATCH (n) RETURN n

List functions
List functions return lists of different data entities.

Further details and examples of lists may be found in Lists and List operators.

Example graph
The following graph is used for the examples below:

[Example graph: Alice (Developer, age 38, brown eyes), Bob (Administrator, age 25, blue eyes), Charlie (Administrator, age 53, green eyes), Daniel (Administrator, brown eyes), and Eskil (Designer, age 41, blue eyes, likedColors ['Pink', 'Yellow', 'Black']), connected by KNOWS relationships, with a MARRIED relationship between Bob and Eskil.]
To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(alice:Developer {name:'Alice', age: 38, eyes: 'Brown'}),
(bob:Administrator {name: 'Bob', age: 25, eyes: 'Blue'}),
(charlie:Administrator {name: 'Charlie', age: 53, eyes: 'Green'}),
(daniel:Administrator {name: 'Daniel', age: 54, eyes: 'Brown'}),
(eskil:Designer {name: 'Eskil', age: 41, eyes: 'blue', likedColors: ['Pink', 'Yellow', 'Black']}),
(alice)-[:KNOWS]->(bob),
(alice)-[:KNOWS]->(charlie),
(bob)-[:KNOWS]->(daniel),
(charlie)-[:KNOWS]->(daniel),
(bob)-[:MARRIED]->(eskil)

keys()
Details

Syntax keys(input)

Description Returns a LIST<STRING> containing the STRING representations for all the property names of a
NODE, RELATIONSHIP or MAP.

Arguments Name Type Description

input NODE | RELATIONSHIP | MAP A node or relationship from


which the names of all
properties will be returned.

Returns LIST<STRING>

Considerations

keys(null) returns null.

Example 113. keys()

Query

MATCH (a) WHERE a.name = 'Alice'


RETURN keys(a)

A LIST<STRING> containing the names of all the properties on the node bound to a is returned.

Result

keys(a)

["eyes", "name", "age"]

Rows: 1

labels()
Details

Syntax labels(input)

Description Returns a LIST<STRING> containing the STRING representations for all the labels of a NODE.

Arguments Name Type Description

input NODE A node whose labels will be


returned.

Returns LIST<STRING>

Considerations

labels(null) returns null.

The order of the returned labels is not guaranteed when using the labels() function.

Example 114. labels()

Query

MATCH (a) WHERE a.name = 'Alice'


RETURN labels(a)

A LIST<STRING> containing all the labels of the node bound to a is returned.

Result

labels(a)

["Developer"]

Rows: 1

nodes()
Details

Syntax nodes(input)

Description Returns a LIST<NODE> containing all the NODE values in a PATH.

Arguments Name Type Description

input PATH A path whose nodes will be


returned.

Returns LIST<NODE>

Considerations

nodes(null) returns null.

Example 115. nodes()

Query

MATCH p = (a)-->(b)-->(c)
WHERE a.name = 'Alice' AND c.name = 'Eskil'
RETURN nodes(p)

A LIST<NODE> containing all the nodes in the path p is returned.

Result

nodes(p)

[(:Developer {name: "Alice", eyes: "Brown", age: 38}), (:Administrator {name: "Bob", eyes: "Blue", age:
25}), (:Designer {name: "Eskil", likedColors: ["Pink", "Yellow", "Black"], eyes: "blue", age: 41})]

Rows: 1

range()
Details

Syntax range(start, end [, step])

Description Returns a LIST<INTEGER> comprising all INTEGER values within a specified range, optionally specifying a step length.

Arguments Name Type Description

start INTEGER The start value of the range.

end INTEGER The end value of the range.

step INTEGER The size of the increment


(default value: 1).

Returns LIST<INTEGER>

Considerations

To create ranges with decreasing INTEGER values, use a negative value step.

The range is inclusive for non-empty ranges, and the arithmetic progression will therefore always contain start
and — depending on the values of start, step and end — end. The only exception where the range does not contain start
are empty ranges.

An empty range will be returned if the value step is negative and start - end is positive, or vice versa, e.g. range(0, 5, -
1).

Example 116. range()

Query

RETURN range(0, 10), range(2, 18, 3), range(0, 5, -1)

Three lists of numbers in the given ranges are returned.

Result

range(0, 10) range(2, 18, 3) range(0, 5, -1)

[0,1,2,3,4,5,6,7,8,9,10] [2,5,8,11,14,17] []

Rows: 1

reduce()
Details

Syntax reduce(accumulator, variable)

Description Runs an expression against individual elements of a LIST<ANY>, storing the result of the
expression in an accumulator.

Arguments Name Type Description

accumulator ANY A variable that holds the


result as the list is iterated.

variable LIST<ANY> A variable that can be used


within the reducing
expression.

Returns ANY

This function is analogous to the fold or reduce method in functional languages such as Lisp and Scala.

Example 117. reduce()

Query

MATCH p = (a)-->(b)-->(c)
WHERE a.name = 'Alice' AND b.name = 'Bob' AND c.name = 'Daniel'
RETURN reduce(totalAge = 0, n IN nodes(p) | totalAge + n.age) AS reduction

The age property of all NODE values in the PATH are summed and returned as a single value.

Result

reduction

117

Rows: 1

relationships()
Details

Syntax relationships(input)

Description Returns a LIST<RELATIONSHIP> containing all the RELATIONSHIP values in a PATH.

Arguments Name Type Description

input PATH The path from which all


relationships will be returned.

Returns LIST<RELATIONSHIP>

Considerations

relationships(null) returns null.

Example 118. relationships()

Query

MATCH p = (a)-->(b)-->(c)
WHERE a.name = 'Alice' AND c.name = 'Eskil'
RETURN relationships(p)

A LIST<RELATIONSHIP> containing all the RELATIONSHIP values in the PATH p is returned.

Result

relationships(p)

[[:KNOWS], [:MARRIED]]

Rows: 1

reverse()
Details

Syntax reverse(input)

Description Returns a STRING or LIST<ANY> in which the order of all characters or elements in the given
STRING or LIST<ANY> have been reversed.

Arguments Name Type Description

input STRING | LIST<ANY> The string or list to be


reversed.

Returns STRING | LIST<ANY>

Considerations

Any null element in original is preserved.

See also String functions → reverse.

Example 119. reverse()

Query

WITH [4923,'abc',521, null, 487] AS ids


RETURN reverse(ids)

Result

reverse(ids)

[487,<null>,521,"abc",4923]

Rows: 1

tail()
Details

Syntax tail(input)

Description Returns all but the first element in a LIST<ANY>.

Arguments Name Type Description

input LIST<ANY> A list from which all but the


first element will be returned.

Returns LIST<ANY>

Example 120. tail()

Query

MATCH (a) WHERE a.name = 'Eskil'


RETURN a.likedColors, tail(a.likedColors)

The property named likedColors and a LIST<ANY> comprising all but the first element of the
likedColors property are returned.

Result

a.likedColors tail(a.likedColors)

["Pink", "Yellow", "Black"] ["Yellow", "Black"]

Rows: 1

toBooleanList()
Details

Syntax toBooleanList(input)

Description Converts a LIST<ANY> of values to a LIST<BOOLEAN> values. If any values are not convertible
to BOOLEAN they will be null in the LIST<BOOLEAN> returned.

Arguments Name Type Description

input LIST<ANY> A list of values to be


converted into a list of
booleans.

Returns LIST<BOOLEAN>

Considerations

Any null element in input is preserved.

Any BOOLEAN value in input is preserved.

If the input is null, null will be returned.

If the input is not a LIST<ANY>, an error will be returned.

The conversion for each value in list is done according to the toBooleanOrNull() function.

Example 121. toBooleanList()

Query

RETURN toBooleanList(null) as noList,


toBooleanList([null, null]) as nullsInList,
toBooleanList(['a string', true, 'false', null, ['A','B']]) as mixedList

Result

noList nullsInList mixedList

<null> [<null>,<null>] [<null>,true,false,<null>,<null>]

Rows: 1

toFloatList()
Details

Syntax toFloatList(input)

Description Converts a LIST<ANY> to a LIST<FLOAT> values. If any values are not convertible to FLOAT they
will be null in the LIST<FLOAT> returned.

Arguments Name Type Description

input LIST<ANY> A list of values to be


converted into a list of floats.

Returns LIST<FLOAT>

Considerations

Any null element in list is preserved.

Any FLOAT value in list is preserved.

If the input is null, null will be returned.

If the input is not a LIST<ANY>, an error will be returned.

The conversion for each value in input is done according to the toFloatOrNull() function.

Example 122. toFloatList()

Query

RETURN toFloatList(null) as noList,


toFloatList([null, null]) as nullsInList,
toFloatList(['a string', 2.5, '3.14159', null, ['A','B']]) as mixedList

Result

noList nullsInList mixedList

<null> [<null>,<null>] [<null>,2.5,3.14159,<null>,<null>]

Rows: 1

toIntegerList()
Details

Syntax toIntegerList(input)

Description Converts a LIST<ANY> to a LIST<INTEGER> values. If any values are not convertible to INTEGER
they will be null in the LIST<INTEGER> returned.

Arguments Name Type Description

input LIST<ANY> A list of values to be


converted into a list of
integers.

Returns LIST<INTEGER>

Considerations

Any null element in input is preserved.

Any INTEGER value in input is preserved.

If the input is null, null will be returned.

If the input is not a LIST<ANY>, an error will be returned.

The conversion for each value in list is done according to the toIntegerOrNull() function.

Example 123. toIntegerList()

Query

RETURN toIntegerList(null) as noList,


toIntegerList([null, null]) as nullsInList,
toIntegerList(['a string', 2, '5', null, ['A','B']]) as mixedList

Result

noList nullsInList mixedList

<null> [<null>,<null>] [<null>,2,5,<null>,<null>]

Rows: 1

toStringList()
Details

Syntax toStringList(input)

Description Converts a LIST<ANY> to a LIST<STRING> values. If any values are not convertible to STRING
they will be null in the LIST<STRING> returned.

Arguments Name Type Description

input LIST<ANY> A list of values to be


converted into a list of strings.

Returns LIST<STRING>

Considerations

Any null element in list is preserved.

Any STRING value in list is preserved.

If the list is null, null will be returned.

If the list is not a LIST<ANY>, an error will be returned.

The conversion for each value in list is done according to the toStringOrNull() function.

Example 124. toStringList()

Query

RETURN toStringList(null) as noList,


toStringList([null, null]) as nullsInList,
toStringList(['already a string', 2, date({year:1955, month:11, day:5}), null, ['A','B']]) as
mixedList

Result

noList nullsInList mixedList

<null> [<null>,<null>] ["already a string","2","1955-11-05",<null>,<null>]

Rows: 1

LOAD CSV functions


LOAD CSV functions can be used to get information about the file that is processed by the LOAD CSV
clause.

The functions described on this page are only useful when run on a query that uses LOAD
 CSV. In all other contexts they will always return null.

linenumber()
Details

Syntax linenumber()

Description Returns the line number that LOAD CSV is currently using.

Returns INTEGER

Considerations

null will be returned if this function is called without a LOAD CSV context.

If the CSV file contains headers, the headers will be linenumber 1 and the 1st row of data will have a linenumber of 2.

file()
Details

Syntax file()

Description Returns the absolute path of the file that LOAD CSV is using.

Returns STRING

Considerations

null will be returned if this function is called without a LOAD CSV context.
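
As an illustration (the file URL below is hypothetical), both functions can be returned alongside each row that LOAD CSV processes:

LOAD CSV WITH HEADERS FROM 'file:///artists.csv' AS row
RETURN file() AS sourceFile, linenumber() AS lineNumber, row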

Mathematical functions - logarithmic


Logarithmic mathematical functions operate on numeric expressions only, and will return an error if used
on any other values. See also Mathematical operators.

e()
Details

Syntax e()

Description Returns the base of the natural logarithm, e.

Returns FLOAT

Example 125. e()

Query

RETURN e()

The base of the natural logarithm, e, is returned.

Result

e()

2.718281828459045

Rows: 1

exp()
Details

Syntax exp(input)

Description Returns e^n, where e is the base of the natural logarithm, and n is the value of the argument
expression.

Arguments Name Type Description

input FLOAT A value to which the base of


the natural logarithm, e, will
be raised.

Returns FLOAT

Considerations

exp(null) returns null.

exp() returns Infinity when the return value is greater than the largest FLOAT value (Java Double.MAX_VALUE).

Example 126. exp()

Query

RETURN exp(2)

e to the power of 2 is returned.

Result

exp(2)

7.38905609893065

Rows: 1

log()
Details

Syntax log(input)

Description Returns the natural logarithm of a FLOAT.

Arguments Name Type Description

input FLOAT A value for which the natural


logarithm will be returned.

Returns FLOAT

Considerations

log(null) returns null.

log(0) returns -Infinity.

If (input < 0), then (log(input)) returns NaN.

Example 127. log()

Query

RETURN log(27)

The natural logarithm of 27 is returned.

Result

log(27)

3.295836866004329

Rows: 1

log10()
Details

Syntax log10(input)

Description Returns the common logarithm (base 10) of a FLOAT.

Arguments Name Type Description

input FLOAT A value for which the


common logarithm (base 10)
will be returned.

Returns FLOAT

Considerations

log10(null) returns null.

log10(0) returns -Infinity.

If (input < 0), then (log10(input)) returns NaN.

Example 128. log10()

Query

RETURN log10(27)

The common logarithm of 27 is returned.

Result

log10(27)

1.4313637641589874

Rows: 1

sqrt()
Details

Syntax sqrt(input)

Description Returns the square root of a FLOAT.

Arguments Name Type Description

input FLOAT The value to calculate the


square root of.

Returns FLOAT

Considerations

sqrt(null) returns null.

If (input < 0), then (sqrt(input)) returns NaN.

Example 129. sqrt()

Query

RETURN sqrt(256)

The square root of 256 is returned.

Result

sqrt(256)

16.0

Rows: 1

Mathematical functions - numeric


Numeric mathematical functions operate on numeric expressions only, and will return an error if used on
any other values. See also Mathematical operators.

Example graph
The following graph is used for the examples below:

[Example graph: Alice (Developer, age 38, brown eyes), Bob (Administrator, age 25, blue eyes), Charlie (Administrator, age 53, green eyes), Daniel (Administrator, brown eyes), and Eskil (Designer, age 41, blue eyes, likedColors ['Pink', 'Yellow', 'Black']), connected by KNOWS relationships, with a MARRIED relationship between Bob and Eskil.]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(alice:Developer {name:'Alice', age: 38, eyes: 'Brown'}),
(bob:Administrator {name: 'Bob', age: 25, eyes: 'Blue'}),
(charlie:Administrator {name: 'Charlie', age: 53, eyes: 'Green'}),
(daniel:Administrator {name: 'Daniel', age: 54, eyes: 'Brown'}),
(eskil:Designer {name: 'Eskil', age: 41, eyes: 'blue', likedColors: ['Pink', 'Yellow', 'Black']}),
(alice)-[:KNOWS]->(bob),
(alice)-[:KNOWS]->(charlie),
(bob)-[:KNOWS]->(daniel),
(charlie)-[:KNOWS]->(daniel),
(bob)-[:MARRIED]->(eskil)

abs()
Details

Syntax abs(input)

Description Returns the absolute value of an INTEGER or FLOAT.

Arguments Name Type Description

input INTEGER | FLOAT A numeric value from which


the absolute number will be
returned.

Returns INTEGER | FLOAT

Considerations

abs(null) returns null.

If input is negative, -(input) (i.e. the negation of input) is returned.

Example 130. abs()

Query

MATCH (a), (e) WHERE a.name = 'Alice' AND e.name = 'Eskil'


RETURN a.age, e.age, abs(a.age - e.age)

The absolute value of the age difference is returned.

Result

a.age e.age abs(a.age - e.age)

38 41 3

Rows: 1

ceil()
Details

Syntax ceil(input)

Description Returns the smallest FLOAT that is greater than or equal to a number and equal to an INTEGER.

Arguments Name Type Description

input FLOAT A value to be rounded to the


nearest higher integer.

Returns FLOAT

Considerations

ceil(null) returns null.

Example 131. ceil()

Query

RETURN ceil(0.1)

The ceil of 0.1 is returned.

Result

ceil(0.1)

1.0

Rows: 1

floor()
Details

Syntax floor(input)

Description Returns the largest FLOAT that is less than or equal to a number and equal to an INTEGER.

Arguments Name Type Description

input FLOAT A value to be rounded to the


nearest lower integer.

Returns FLOAT

Considerations

floor(null) returns null.

Example 132. floor()

Query

RETURN floor(0.9)

The floor of 0.9 is returned.

Result

floor(0.9)

0.0

Rows: 1

isNaN()
Details

Syntax isNaN(input)

Description Returns whether the given INTEGER or FLOAT is NaN.

Arguments Name Type Description

input INTEGER | FLOAT A numeric value to be


compared against NaN.

Returns BOOLEAN

Considerations

isNaN(null) returns null.

Example 133. isNaN()

Query

RETURN isNaN(0/0.0)

true is returned since the value is NaN.

Result

isNaN(0/0.0)

true

Rows: 1

rand()
Details

Syntax rand()

Description Returns a random FLOAT in the range from 0 (inclusive) to 1 (exclusive).

Returns FLOAT

Example 134. rand()

Query

RETURN rand()

A random number is returned.

Result

rand()

0.5460251846326871

Rows: 1

round()
Details

Syntax round(value[, precision, mode])

Description Returns the value of a rounded number, optionally using a specified precision and rounding
mode.

Arguments Name Type Description

value FLOAT A value to be rounded.

precision INTEGER | FLOAT The rounding precision.

mode STRING A precision rounding mode


(UP, DOWN, CEILING, FLOOR,
HALF_UP, HALF_DOWN,
HALF_EVEN).

Returns FLOAT

Modes

mode Description

UP Round away from zero.

DOWN Round towards zero.

CEILING Round towards positive infinity.

FLOOR Round towards negative infinity.

HALF_UP Round towards closest value of given precision, with ties always being rounded away from zero.

HALF_DOWN Round towards closest value of given precision, with ties always being rounded towards zero.

HALF_EVEN Round towards closest value of given precision, with ties always being rounded to the even neighbor.

Considerations

For the rounding modes, a tie means that the two closest values of the given precision are at the same distance from the
given value. E.g. for precision 1, 2.15 is a tie as it has equal distance to 2.1 and 2.2, while 2.151 is not a tie, as it is closer to
2.2.

round() returns null if any of its input parameters are null.
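
For instance, the following query (a sketch added here for illustration, not part of the numbered examples) contrasts a tie with a non-tie at precision 1. Assuming the literals are handled as exact decimals, round(2.15, 1, 'HALF_UP') should return 2.2 and round(2.15, 1, 'HALF_DOWN') should return 2.1, while 2.151 is not a tie and should round to 2.2 under either mode:

RETURN round(2.15, 1, 'HALF_UP') AS tieUp,
  round(2.15, 1, 'HALF_DOWN') AS tieDown,
  round(2.151, 1, 'HALF_DOWN') AS notATie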

Example 135. round()

Query

RETURN round(3.141592)

3.0 is returned.

Result

round(3.141592)

3.0

Rows: 1

Example 136. round() of negative number with tie

Query

RETURN round(-1.5)

Ties are rounded towards positive infinity, therefore -1.0 is returned.

Result

round(-1.5)

-1.0

Rows: 1

round() with precision


Example 137. round() with precision

Query

RETURN round(3.141592, 3)

3.142 is returned.

Result

round(3.141592, 3)

3.142

Rows: 1

Example 138. round() with precision 0 and tie

Query

RETURN round(-1.5, 0)

To align with round(-1.5), -1.0 is returned.

Result

round(-1.5, 0)

-1.0

Rows: 1

Example 139. round() with precision 1 and tie

Query

RETURN round(-1.55, 1)

The default is to round away from zero when there is a tie, therefore -1.6 is returned.

Result

round(-1.55, 1)

-1.6

Rows: 1

round() with precision and rounding mode


Example 140. round() with precision and UP rounding mode

Query

RETURN round(1.249, 1, 'UP') AS positive,
  round(-1.251, 1, 'UP') AS negative,
  round(1.25, 1, 'UP') AS positiveTie,
  round(-1.35, 1, 'UP') AS negativeTie

The rounded values using precision 1 and rounding mode UP are returned.

Result

positive negative positiveTie negativeTie

1.3 -1.3 1.3 -1.4

Rows: 1

Example 141. round() with precision and DOWN rounding mode

Query

RETURN round(1.249, 1, 'DOWN') AS positive,
  round(-1.251, 1, 'DOWN') AS negative,
  round(1.25, 1, 'DOWN') AS positiveTie,
  round(-1.35, 1, 'DOWN') AS negativeTie

The rounded values using precision 1 and rounding mode DOWN are returned.

Result

positive negative positiveTie negativeTie

1.2 -1.2 1.2 -1.3

Rows: 1

Example 142. round() with precision and CEILING rounding mode

Query

RETURN round(1.249, 1, 'CEILING') AS positive,
  round(-1.251, 1, 'CEILING') AS negative,
  round(1.25, 1, 'CEILING') AS positiveTie,
  round(-1.35, 1, 'CEILING') AS negativeTie

The rounded values using precision 1 and rounding mode CEILING are returned.

Result

positive negative positiveTie negativeTie

1.3 -1.2 1.3 -1.3

Rows: 1

Example 143. round() with precision and FLOOR rounding mode

Query

RETURN round(1.249, 1, 'FLOOR') AS positive,
  round(-1.251, 1, 'FLOOR') AS negative,
  round(1.25, 1, 'FLOOR') AS positiveTie,
  round(-1.35, 1, 'FLOOR') AS negativeTie

The rounded values using precision 1 and rounding mode FLOOR are returned.

Result

positive negative positiveTie negativeTie

1.2 -1.3 1.2 -1.4

Rows: 1

Example 144. round() with precision and HALF_UP rounding mode

Query

RETURN round(1.249, 1, 'HALF_UP') AS positive,
  round(-1.251, 1, 'HALF_UP') AS negative,
  round(1.25, 1, 'HALF_UP') AS positiveTie,
  round(-1.35, 1, 'HALF_UP') AS negativeTie

The rounded values using precision 1 and rounding mode HALF_UP are returned.

Result

positive negative positiveTie negativeTie

1.2 -1.3 1.3 -1.4

Rows: 1

Example 145. round() with precision and HALF_DOWN rounding mode

Query

RETURN round(1.249, 1, 'HALF_DOWN') AS positive,
  round(-1.251, 1, 'HALF_DOWN') AS negative,
  round(1.25, 1, 'HALF_DOWN') AS positiveTie,
  round(-1.35, 1, 'HALF_DOWN') AS negativeTie

The rounded values using precision 1 and rounding mode HALF_DOWN are returned.

Result

positive negative positiveTie negativeTie

1.2 -1.3 1.2 -1.3

Rows: 1

Example 146. round() with precision and HALF_EVEN rounding mode

Query

RETURN round(1.249, 1, 'HALF_EVEN') AS positive,
  round(-1.251, 1, 'HALF_EVEN') AS negative,
  round(1.25, 1, 'HALF_EVEN') AS positiveTie,
  round(-1.35, 1, 'HALF_EVEN') AS negativeTie

The rounded values using precision 1 and rounding mode HALF_EVEN are returned.

Result

positive negative positiveTie negativeTie

1.2 -1.3 1.2 -1.4

Rows: 1

sign()
Details

Syntax sign(input)

Description Returns the signum of an INTEGER or FLOAT: 0 if the number is 0, -1 for any negative number,
and 1 for any positive number.

Arguments Name Type Description

input INTEGER | FLOAT A positive or negative


number.

Returns INTEGER

Considerations

sign(null) returns null.

Example 147. sign()

Query

RETURN sign(-17), sign(0.1)

The signs of -17 and 0.1 are returned.

Result

sign(-17) sign(0.1)

-1 1

Rows: 1

Mathematical functions - trigonometric


Trigonometric mathematical functions operate on numeric expressions only, and will return an error if used
on any other values. See also Mathematical operators.

acos()
Details

Syntax acos(input)

Description Returns the arccosine of a FLOAT in radians.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

acos(null) returns null.

If (input < -1) or (input > 1), then (acos(input)) returns NaN.

Example 148. acos()

Query

RETURN acos(0.5)

The arccosine of 0.5 is returned.

Result

acos(0.5)

1.0471975511965979

Rows: 1

asin()
Details

Syntax asin(input)

Description Returns the arcsine of a FLOAT in radians.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

asin(null) returns null.

If (input < -1) or (input > 1), then (asin(input)) returns NaN.

Example 149. asin()

Query

RETURN asin(0.5)

The arcsine of 0.5 is returned.

Result

asin(0.5)

0.5235987755982989

Rows: 1

atan()
Details

Syntax atan(input)

Description Returns the arctangent of a FLOAT in radians.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

atan(null) returns null.

Example 150. atan()

Query

RETURN atan(0.5)

The arctangent of 0.5 is returned.

Result

atan(0.5)

0.4636476090008061

Rows: 1

atan2()
Details

Syntax atan2(y, x)

Description Returns the arctangent2 of a set of coordinates in radians.

Arguments Name Type Description

y FLOAT A y angle in radians.

x FLOAT An x angle in radians.

Returns FLOAT

Considerations

atan2(null, null), atan2(null, x) and atan2(y, null) all return null.

Example 151. atan2()

Query

RETURN atan2(0.5, 0.6)

The arctangent2 of 0.5 and 0.6 is returned.

Result

atan2(0.5, 0.6)

0.6947382761967033

Rows: 1

cos()
Details

Syntax cos(input)

Description Returns the cosine of a FLOAT in radians.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

cos(null) returns null.

Example 152. cos()

Query

RETURN cos(0.5)

The cosine of 0.5 is returned.

Result

cos(0.5)

0.8775825618903728

Rows: 1

cot()
Details

Syntax cot(input)

Description Returns the cotangent of a FLOAT.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

cot(null) returns null.

cot(0) returns Infinity.

Example 153. cot()

Query

RETURN cot(0.5)

The cotangent of 0.5 is returned.

Result

cot(0.5)

1.830487721712452

Rows: 1

degrees()
Details

Syntax degrees(input)

Description Converts radians to degrees.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

degrees(null) returns null.

Example 154. degrees

Query

RETURN degrees(3.14159)

The number of degrees in something close to pi is returned.

Result

degrees(3.14159)

179.9998479605043

Rows: 1

haversin()
Details

Syntax haversin(input)

Description Returns half the versine of a number.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

haversin(null) returns null.

Example 155. haversin()

Query

RETURN haversin(0.5)

The haversine of 0.5 is returned.

Result

haversin(0.5)

0.06120871905481362

Rows: 1

Spherical distance using the haversin() function


The haversin() function may be used to compute the distance on the surface of a sphere between two
points (each given by their latitude and longitude).

Example 156. haversin()

In this example the spherical distance (in km) between Berlin in Germany (at lat 52.5, lon 13.4) and
San Mateo in California (at lat 37.5, lon -122.3) is calculated using an average earth radius of 6371
km.

Query

CREATE (ber:City {lat: 52.5, lon: 13.4}), (sm:City {lat: 37.5, lon: -122.3})
RETURN 2 * 6371 * asin(sqrt(haversin(radians( sm.lat - ber.lat ))
+ cos(radians( sm.lat )) * cos(radians( ber.lat )) *
haversin(radians( sm.lon - ber.lon )))) AS dist

The estimated distance between 'Berlin' and 'San Mateo' is returned.

Result

dist

9129.969740051658

Rows: 1

pi()
Details

Syntax pi()

Description Returns the mathematical constant pi.

Returns FLOAT

Example 157. pi()

Query

RETURN pi()

The constant pi is returned.

Result

pi()

3.141592653589793

Rows: 1

radians()
Details

Syntax radians(input)

Description Converts degrees to radians.

Arguments Name Type Description

input FLOAT An angle in degrees.

Returns FLOAT

Considerations

radians(null) returns null.

Example 158. radians()

Query

RETURN radians(180)

The number of radians in 180 degrees is returned (pi).

Result

radians(180)

3.141592653589793

Rows: 1

sin()
Details

Syntax sin(input)

Description Returns the sine of a FLOAT.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

sin(null) returns null.

Example 159. sin()

Query

RETURN sin(0.5)

The sine of 0.5 is returned.

Result

sin(0.5)

0.479425538604203

Rows: 1

tan()
Details

Syntax tan(input)

Description Returns the tangent of a FLOAT.

Arguments Name Type Description

input FLOAT An angle in radians.

Returns FLOAT

Considerations

tan(null) returns null.

Example 160. tan()

Query

RETURN tan(0.5)

The tangent of 0.5 is returned.

Result

tan(0.5)

0.5463024898437905

Rows: 1

Predicate functions

Introduction
Predicates are boolean functions that return true or false for a given set of non-null input. They are most
commonly used to filter out paths in the WHERE part of a query.

Example graph
The following graph is used for the examples below:

[Example graph: Person nodes for Keanu Reeves, Carrie Anne Moss, Liam Neeson, Guy Pearce, Kathryn Bigelow, and Jessica Chastain (with name, age, nationality, and address properties) and a Movie node for 'The Matrix', connected by KNOWS and ACTED_IN relationships. The full property values are given in the CREATE statement below.]

To recreate it, run the following query against an empty Neo4j database:

CREATE
(keanu:Person {name:'Keanu Reeves', age:58, nationality:'Canadian'}),
(carrie:Person {name:'Carrie Anne Moss', age:55, nationality:'American'}),
(liam:Person {name:'Liam Neeson', age:70, nationality:'Northern Irish'}),
(guy:Person {name:'Guy Pearce', age:55, nationality:'Australian'}),
(kathryn:Person {name:'Kathryn Bigelow', age:71, nationality:'American'}),
(jessica:Person {name:'Jessica Chastain', age:45, address:''}),
(theMatrix:Movie {title:'The Matrix'}),
(keanu)-[:KNOWS]->(carrie),
(keanu)-[:KNOWS]->(liam),
(keanu)-[:KNOWS]->(kathryn),
(kathryn)-[:KNOWS]->(jessica),
(carrie)-[:KNOWS]->(guy),
(liam)-[:KNOWS]->(guy),
(keanu)-[:ACTED_IN]->(theMatrix),
(carrie)-[:ACTED_IN]->(theMatrix)

all()
Details

Syntax all(variable IN list WHERE predicate)

Description Returns true if the predicate holds for all elements in the given LIST<ANY>.

Arguments Name Type Description

variable ANY A variable that can be used


within the WHERE clause.

list LIST<ANY> A predicate must hold for all


elements in this list for the
function to return true.

predicate ANY A predicate that is tested


against all items in the given
list.

Returns BOOLEAN

Considerations

null is returned if the list is null or if the predicate evaluates to null for at least one element and does not evaluate to
false for any other element.

Example 161. all()

Query

MATCH p = (a)-[*]->(b)
WHERE
a.name = 'Keanu Reeves'
AND b.name = 'Guy Pearce'
AND all(x IN nodes(p) WHERE x.age < 60)
RETURN p

All nodes in the returned paths will have a property age with a value lower than 60:

[Figure: the path Keanu Reeves -[:KNOWS]-> Carrie Anne Moss -[:KNOWS]-> Guy Pearce.]

Result

(:Person {nationality: "Canadian",name: "Keanu Reeves",age: 58})-[:KNOWS]->(:Person {nationality:


"American",name: "Carrie Anne Moss",age: 55})-[:KNOWS]->(:Person {nationality: "Australian",name: "Guy
Pearce",age: 55})

Rows: 1

any()
Details

Syntax any(variable IN list WHERE predicate)

Description Returns true if the predicate holds for at least one element in the given LIST<ANY>.

Arguments Name Type Description

variable ANY A variable that can be used


within the WHERE clause.

list LIST<ANY> A predicate must hold for at least one element in this list for the function to return true.

predicate ANY A predicate that is tested


against all items in the given
list.

Returns BOOLEAN

Considerations

null is returned if the list is null, or if the predicate evaluates to null for at least one element and does not evaluate to true for any other element.

Example 162. any()

Query

MATCH (p:Person)
WHERE any(nationality IN p.nationality WHERE nationality = 'American')
RETURN p

The query returns the Person nodes with the nationality property value American:

Result

{"nationality":"American","name":"Carrie Anne Moss","age":55}

{"nationality":"American","name":"Kathryn Bigelow","age":71}

Rows: 2

exists()
Details

Syntax exists(input)

Description Returns true if a match for the pattern exists in the graph.

Arguments Name Type Description

input ANY A pattern to verify the


existence of.

Returns BOOLEAN

Considerations

null is returned if input is null.

To check if a property is not null, use the IS NOT NULL predicate.
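
For instance, a minimal sketch against the example graph above (the expected result is every Person except Jessica Chastain, whose nationality property is not set):

MATCH (p:Person)
WHERE p.nationality IS NOT NULL
RETURN p.name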

Example 163. exists()

Query

MATCH (p:Person)
RETURN
p.name AS name,
exists((p)-[:ACTED_IN]->()) AS has_acted_in_rel

This query returns the name property of every Person node, along with a boolean (true or false)
indicating if those nodes have an ACTED_IN relationship in the graph.

Result

name has_acted_in_rel

"Carrie Anne Moss" true

"Keanu Reeves" true

"Liam Neeson" false

"Guy Pearce" false

"Kathryn Bigelow" false

"Jessica Chastain" false

Rows: 6

For information about the EXISTS subquery, which is more versatile than the exists() function, see
EXISTS subqueries.

isEmpty()
Details

Syntax isEmpty(input)

Description Checks whether a STRING, MAP or LIST<ANY> is empty.

Arguments Name Type Description

input STRING | MAP | LIST<ANY> A value to be checked for


emptiness.

Returns BOOLEAN

Example 164. isEmpty(list)

Query

MATCH (p:Person)
WHERE NOT isEmpty(p.nationality)
RETURN p.name, p.nationality

This query returns every Person node in the graph with a set nationality property value (i.e., all
Person nodes except for Jessica Chastain):

Result

p.name p.nationality

"Keanu Reeves" "Canadian"

"Carrie Anne Moss" "American"

"Liam Neeson" "Northern Irish"

"Guy Pearce" "Australian"

"Kathryn Bigelow" "American"

Rows: 5

Example 165. isEmpty(map)

Query

MATCH (n)
WHERE isEmpty(properties(n))
RETURN n

Because the example graph contains no empty nodes, nothing is returned:

Result

(no changes, no records)

Example 166. isEmpty(string)

Query

MATCH (p:Person)
WHERE isEmpty(p.address)
RETURN p.name AS name

The name property of each node that has an empty STRING address property is returned:

Result

name

"Jessica Chastain"

Rows: 1

The function isEmpty(), like most other Cypher functions, returns null if null is passed in to the function.
That means that a predicate isEmpty(n.address) will filter out all nodes where the address property is not
set. Thus, isEmpty() is not suited to test for null-values. IS NULL or IS NOT NULL should be used for that
purpose.
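
A minimal sketch of the difference (isEmpty(null) yields null, whereas the IS NULL predicate yields a definite true):

RETURN isEmpty(null) AS emptyCheck, null IS NULL AS nullCheck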

none()
Details

Syntax none(variable IN list WHERE predicate)

Description Returns true if the predicate holds for no element in the given LIST<ANY>.

Arguments Name Type Description

variable ANY A variable that can be used


within the WHERE clause.

list LIST<ANY> A predicate must not hold for any element in this list for the function to return true.

predicate ANY A predicate that is tested


against all items in the given
list.

Returns BOOLEAN

Considerations

null is returned if the list is null, or if the predicate evaluates to null for at least one element and does not evaluate to
true for any other element.

Example 167. none()

Query

MATCH p = (n)-[*]->(b)
WHERE
n.name = 'Keanu Reeves'
AND none(x IN nodes(p) WHERE x.age > 60)
RETURN p

No node in the returned path has an age property with a greater value than 60:

[Figure: the matched paths from Keanu Reeves via Carrie Anne Moss, ending at Guy Pearce.]

Result

(:Person {nationality: "Canadian",name: "Keanu Reeves",age: 58})-[:KNOWS]→(:Person {nationality:


"American",name: "Carrie Anne Moss",age: 55})

(:Person {nationality: "Canadian",name: "Keanu Reeves",age: 58})-[:KNOWS]→(:Person {nationality:


"American",name: "Carrie Anne Moss",age: 55})-[:KNOWS]→(:Person {nationality: "Australian",name: "Guy
Pearce",age: 55})

Rows: 2

single()
Details

Syntax single(variable IN list WHERE predicate)

Description Returns true if the predicate holds for exactly one of the elements in the given LIST<ANY>.

Arguments Name Type Description

variable ANY A variable that can be used


within the WHERE clause.

list LIST<ANY> A predicate must hold for exactly one element in this list for the function to return true.

predicate ANY A predicate that is tested


against all items in the given
list.

Returns BOOLEAN

Considerations

null is returned if the list is null, or if the predicate evaluates to null for at least one element and does not evaluate to
true for any other element.

Example 168. single()

Query

MATCH p = (n)-->(b)
WHERE
n.name = 'Keanu Reeves'
AND single(x IN nodes(p) WHERE x.nationality = 'Northern Irish')
RETURN p

In every returned path there is exactly one node which has the nationality property value Northern
Irish:

Result

(:Person {nationality: "Canadian",name: "Keanu Reeves",age: 58})-[:KNOWS]→(:Person {nationality:


"Northern Irish",name: "Liam Neeson",age: 70})

Rows: 1

Scalar functions
Scalar functions return a single value.

Example graph
The following graph is used for the examples below:

[Example graph: five nodes, Alice (Developer), Bob, Charlie, and Daniel (Administrator), and Eskil (Designer), each with name, age, and eyes properties (Eskil also has likedColors), connected by KNOWS relationships and one MARRIED relationship. The full property values are given in the CREATE statement below.]
To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(alice:Developer {name:'Alice', age: 38, eyes: 'Brown'}),
(bob:Administrator {name: 'Bob', age: 25, eyes: 'Blue'}),
(charlie:Administrator {name: 'Charlie', age: 53, eyes: 'Green'}),
(daniel:Adminstrator {name: 'Daniel', age: 54, eyes: 'Brown'}),
(eskil:Designer {name: 'Eskil', age: 41, eyes: 'blue', likedColors: ['Pink', 'Yellow', 'Black']}),
(alice)-[:KNOWS]->(bob),
(alice)-[:KNOWS]->(charlie),
(bob)-[:KNOWS]->(daniel),
(charlie)-[:KNOWS]->(daniel),
(bob)-[:MARRIED]->(eskil)

char_length() (introduced in Neo4j 5.13)


Details

Syntax char_length(input)

Description Returns the number of Unicode characters in a STRING.

Arguments Name Type Description

input STRING A string value whose length


in characters is to be
calculated.

Returns INTEGER

This function is an alias of the size() function, and was introduced as part of Cypher’s GQL conformance.

Considerations

char_length(null) returns null.
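
Because char_length() is an alias of size(), the two functions are interchangeable for STRING values; a minimal sketch:

RETURN char_length('Alice') = size('Alice') AS sameResult

This should return true.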

Example 169. char_length()

Query

RETURN char_length('Alice')

Result

char_length('Alice')

5

Rows: 1

The number of Unicode characters in the STRING is returned.

character_length() (introduced in Neo4j 5.13)


Details

Syntax character_length(input)

Description Returns the number of Unicode characters in a STRING.

Arguments Name Type Description

input STRING A string value whose length


in characters is to be
calculated.

Returns INTEGER

This function is an alias of the size() function, and was introduced as part of Cypher’s GQL conformance.

Considerations

character_length(null) returns null.

Example 170. character_length()

Query

RETURN character_length('Alice')

Result

character_length('Alice')

5

Rows: 1

The number of Unicode characters in the STRING is returned.

coalesce()
Details

Syntax coalesce(input)

Description Returns the first non-null value in a list of expressions.

Arguments Name Type Description

input ANY If this is the first non-NULL


value, it will be returned.

Returns ANY

Considerations

null will be returned if all the arguments are null.
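
coalesce() accepts any number of arguments and returns the first one that is non-null; a minimal sketch with literal values:

RETURN coalesce(null, null, 'default') AS value

This should return "default".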

Example 171. coalesce()

Query

MATCH (a)
WHERE a.name = 'Alice'
RETURN coalesce(a.hairColor, a.eyes)

Result

coalesce(a.hairColor, a.eyes)

"Brown"

Rows: 1

elementId()
Details

Syntax elementId(input)

Description Returns the element id of a NODE or RELATIONSHIP.

Arguments Name Type Description

input NODE | RELATIONSHIP An element id of a node or a


relationship.

Returns STRING

There are important considerations to bear in mind when using elementId():

1. Every node and relationship is guaranteed an element ID. This ID is unique among both nodes and
relationships across all databases in the same DBMS within the scope of a single transaction.
However, no guarantees are given regarding the order of the returned ID values or the length of the ID
STRING values. Outside of the scope of a single transaction, no guarantees are given about the mapping
between ID values and elements.

2. Neo4j reuses its internal IDs when nodes and relationships are deleted. Applications relying on internal
Neo4j IDs are, as a result, brittle and can be inaccurate. It is therefore recommended to use application-
generated IDs.

Considerations

elementId(null) returns null.

elementId on values other than a NODE, RELATIONSHIP, or null will fail the query.
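
As a sketch of the application-generated ID approach recommended above (the :Person label and the id property name are illustrative only, not part of the example graph):

CREATE (n:Person {id: randomUUID(), name: 'Example'})
RETURN n.id AS applicationGeneratedId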

Example 172. elementId() for nodes

Query

MATCH (n:Developer)
RETURN elementId(n)

The identifier for each Developer node is returned.

Result

elementId(n)

"4:d8d172ec-96d8-4364-8f5d-9353d776aeb3:0"

Rows: 1

Example 173. elementId() for relationships

Query

MATCH (:Developer)-[r]-()
RETURN elementId(r)

The identifier for each relationship connected to a Developer node is returned.

Result

elementId(r)

"5:d8d172ec-96d8-4364-8f5d-9353d776aeb3:0"

"5:d8d172ec-96d8-4364-8f5d-9353d776aeb3:1"

Rows: 2

endNode()
Details

Syntax endNode(input)

Description Returns the end NODE of a RELATIONSHIP.

Arguments Name Type Description

input RELATIONSHIP A relationship.

Returns NODE

Considerations

endNode(null) returns null.

Example 174. endNode()

Query

MATCH (x:Developer)-[r]-()
RETURN endNode(r)

Result

endNode(r)

{name: "Bob", age: 25, eyes: "Blue"}

{name: "Charlie", age: 53, eyes: "Green"}

Rows: 2

head()
Details

Syntax head(list)

Description Returns the first element in a LIST<ANY>.

Arguments Name Type Description

list LIST<ANY> A list from which the first


element will be returned.

Returns ANY

Considerations

head(null) returns null.

head([]) returns null.

If the first element in list is null, head(list) will return null.

Example 175. head()

Query

MATCH (a)
WHERE a.name = 'Eskil'
RETURN a.likedColors, head(a.likedColors)

The first element in the list is returned.

Result

a.likedColors head(a.likedColors)

["Pink", "Yellow", "Black"] "Pink"

Rows: 1

id() (deprecated)

It is recommended to use elementId() instead.

Details

Syntax id(input)

Description Returns the id of a NODE or RELATIONSHIP.

Arguments Name Type Description

input NODE | RELATIONSHIP A node or a relationship.

Returns INTEGER

Considerations

id(null) returns null.

There are important considerations to bear in mind when using id():

1. The function id() returns a node or a relationship identifier, unique by an object type and a database.
Therefore, id() can return the same value for both nodes and relationships in the same database.

2. Neo4j implements the ID so that every node and relationship in a database has an identifier. The
identifier for a node or relationship is guaranteed to be unique among other nodes' and relationships'
identifiers in the same database, within the scope of a single transaction.

3. Neo4j reuses its internal IDs when nodes and relationships are deleted. Applications relying on internal
Neo4j IDs are, as a result, brittle and can be inaccurate. It is therefore recommended to use application-
generated IDs instead.

On a composite database, the id() function should be used with caution. It is recommended to use
elementId() instead.

When called in database-specific subqueries, the resulting ID value for a node or relationship is local to
that database. The local ID for nodes or relationships from different databases may be the same.

When called from the root context of a query, the resulting value is an extended ID for the node or
relationship. The extended ID is likely different from the local ID for the same node or relationship.

Example 176. id()

Query

MATCH (a)
RETURN id(a)

The node identifier for each of the nodes is returned.

Result

id(a)

Rows: 5

last()
Details

Syntax last(list)

Description Returns the last element in a LIST<ANY>.

Arguments Name Type Description

list LIST<ANY> A list from which the last


element will be returned.

Returns ANY

Considerations:

last(null) returns null.

last([]) returns null.

If the last element in list is null, last(list) will return null.

Example 177. last()

Query

MATCH (a)
WHERE a.name = 'Eskil'
RETURN a.likedColors, last(a.likedColors)

The last element in the list is returned.

Result

a.likedColors last(a.likedColors)

["Pink", "Yellow", "Black"] "Black"

Rows: 1

length()
Details

Syntax length(input)

Description Returns the length of a PATH.

Arguments Name Type Description

input PATH A path whose relationships


will be counted.

Returns INTEGER

Considerations

length(null) returns null.

To calculate the length of a LIST or the number of Unicode characters in a STRING, see size().

Example 178. length()

Query

MATCH p = (a)-->(b)-->(c)
WHERE a.name = 'Alice'
RETURN length(p)

The length of the path p is returned.

Result

length(p)

2

2

2

Rows: 3

nullIf()
Details

Syntax nullIf(v1, v2)

Description Returns null if the two given parameters are equivalent, otherwise returns the value of the
first parameter.

Arguments Name Type Description

v1 ANY A first value to be returned if


the second value is not
equivalent.

v2 ANY A second value against which


the first value is compared.

Returns ANY

This function is the opposite of the coalesce() function, which returns a default value if the given value is
null.

Example 179. nullIf()

Query

RETURN nullIf(4, 4)

The null value is returned as the two parameters are equivalent.

Result

nullIf(4, 4)

null

Rows: 1

Example 180. nullIf()

Query

RETURN nullIf("abc", "def")

The first parameter, "abc", is returned, as the two parameters are not equivalent.

Result

nullIf("abc", "def")

"abc"

Rows: 1

Example 181. nullIf()

The nullIf() function can be used in conjunction with the coalesce() function for transforming one
data value into another value:

Query

MATCH (a)
RETURN a.name AS name, coalesce(nullIf(a.eyes, "Brown"), "Hazel") AS eyeColor

Result

name eyeColor

"Alice" "Hazel"

"Bob" "Blue"

"Charlie" "Green"

"Daniel" "Hazel"

"Eskil" "Blue"

Rows: 5

properties()
Details

Syntax properties(input)

Description Returns a MAP containing all the properties of a NODE, RELATIONSHIP or MAP.

Arguments Name Type Description

input NODE | RELATIONSHIP | MAP An entity to return the


properties from.

Returns MAP

Considerations

properties(null) returns null.

If input is already a MAP, it is returned unchanged.
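
For instance, passing a literal MAP returns that same map unchanged; a minimal sketch:

RETURN properties({name: 'Stefan', city: 'Berlin'}) AS map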

Example 182. properties()

Query

CREATE (p:Person {name: 'Stefan', city: 'Berlin'})
RETURN properties(p)

Result

properties(p)

{"city": "Berlin", "name": "Stefan"}

Rows: 1

randomUUID()
Details

Syntax randomUUID()

Description Generates a random UUID.

Returns STRING

A Universally Unique Identifier (UUID), also known as a Globally Unique Identifier (GUID), is a 128-bit
value with strong guarantees of uniqueness.

Example 183. randomUUID()

Query

RETURN randomUUID() AS uuid

Result

uuid

"9f4c297d-309a-4743-a196-4525b96135c1"

Rows: 1

A randomly-generated UUID is returned.

size()
Details

Syntax size(input)

Description Returns the number of items in a LIST<ANY> or the number of Unicode characters in a STRING.

Arguments Name Type Description

input STRING | LIST<ANY> A value whose length is to be


calculated.

Returns INTEGER

To calculate the length of a PATH, see length().

Considerations

size(null) returns null.

Example 184. size() applied to lists

Query

RETURN size(['Alice', 'Bob'])

Result

size(['Alice', 'Bob'])

2

Rows: 1

The number of elements in the list is returned.

Example 185. size() applied to pattern comprehensions

Query

MATCH (a)
WHERE a.name = 'Alice'
RETURN size([p=(a)-->()-->() | p]) AS fof

Result

fof

3

Rows: 1

The number of paths matching the pattern expression is returned. (The size of the list of paths).

Example 186. size() applied to strings

Query

MATCH (a)
WHERE size(a.name) > 6
RETURN size(a.name)

Result

size(a.name)

7

Rows: 1

The number of characters in the STRING 'Charlie' is returned.

startNode()
Details

Syntax startNode(input)

Description Returns the start NODE of a RELATIONSHIP.

Arguments Name Type Description

input RELATIONSHIP A relationship.

Returns NODE

Considerations

startNode(null) returns null.

Example 187. startNode()

Query

MATCH (x:Developer)-[r]-()
RETURN startNode(r)

Result

startNode(r)

{name: "Alice", age: 38, eyes: "Brown"}

{name: "Alice", age: 38, eyes: "Brown"}

Rows: 2

timestamp()
Details

Syntax timestamp()

Description Returns the difference, measured in milliseconds, between the current time and midnight,
January 1, 1970 UTC.

Returns INTEGER

It is the equivalent of datetime().epochMillis.

Considerations

timestamp() will return the same value during one entire query, even for long-running queries.
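
Combining the two notes above, the following sketch should return true, assuming the default statement clock is used for both functions:

RETURN timestamp() = datetime().epochMillis AS isEquivalent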

Example 188. timestamp()

Query

RETURN timestamp()

The time in milliseconds is returned.

Result

timestamp()

1655201331965

Rows: 1

toBoolean()
Details

Syntax toBoolean(input)

Description Converts a BOOLEAN, STRING or INTEGER value to a BOOLEAN value. For INTEGER values, 0 is
defined to be false and any other INTEGER is defined to be true.

Arguments Name Type Description

input BOOLEAN | STRING | INTEGER A value to be converted into a


boolean.

Returns BOOLEAN

Considerations

toBoolean(null) returns null.

If input is a BOOLEAN value, it will be returned unchanged.

If the parsing fails, null will be returned.

If input is the INTEGER value 0, false will be returned. For any other INTEGER value true will be returned.

This function will return an error if provided with an expression that is not a STRING, INTEGER or BOOLEAN value.

Example 189. toBoolean()

Query

RETURN toBoolean('true'), toBoolean('not a boolean'), toBoolean(0)

Result

toBoolean('true') toBoolean('not a boolean') toBoolean(0)

true <null> false

Rows: 1

toBooleanOrNull()
Details

Syntax toBooleanOrNull(input)

Description Converts a value to a BOOLEAN value, or null if the value cannot be converted.

Arguments Name Type Description

input ANY A value to be converted into a


boolean or null.

Returns BOOLEAN

Considerations

toBooleanOrNull(null) returns null.

If input is a BOOLEAN value, it will be returned unchanged.

If the parsing fails, null will be returned.

If input is the INTEGER value 0, false will be returned. For any other INTEGER value true will be returned.

If the input is not a STRING, INTEGER or BOOLEAN value, null will be returned.

Example 190. toBooleanOrNull()

Query

RETURN toBooleanOrNull('true'), toBooleanOrNull('not a boolean'), toBooleanOrNull(0), toBooleanOrNull(1.5)

Result

toBooleanOrNull('true') toBooleanOrNull('not a boolean') toBooleanOrNull(0) toBooleanOrNull(1.5)

true <null> false <null>

Rows: 1

toFloat()
Details

Syntax toFloat(input)

Description Converts a STRING, INTEGER or FLOAT value to a FLOAT value.

Arguments Name Type Description

input STRING | INTEGER | FLOAT A value to be converted into a


float.

Returns FLOAT

Considerations

toFloat(null) returns null.

If input is a FLOAT, it will be returned unchanged.

If the parsing fails, null will be returned.

This function will return an error if provided with an expression that is not an INTEGER, FLOAT or a STRING value.

Example 191. toFloat()

Query

RETURN toFloat('11.5'), toFloat('not a number')

Result

toFloat('11.5') toFloat('not a number')

11.5 <null>

Rows: 1

toFloatOrNull()
Details

Syntax toFloatOrNull(input)

Description Converts a value to a FLOAT value, or null if the value cannot be converted.

Arguments Name Type Description

input ANY A value to be converted into a


float or null.

Returns FLOAT

Considerations

toFloatOrNull(null) returns null.

If input is a FLOAT, it will be returned unchanged.

If the parsing fails, null will be returned.

If the input is not an INTEGER, FLOAT or a STRING value, null will be returned.

Example 192. toFloatOrNull()

Query

RETURN toFloatOrNull('11.5'), toFloatOrNull('not a number'), toFloatOrNull(true)

Result

toFloatOrNull('11.5') toFloatOrNull('not a number') toFloatOrNull(true)

11.5 <null> <null>

Rows: 1

toInteger()
Details

Syntax toInteger(input)

Description Converts a BOOLEAN, STRING, INTEGER or FLOAT value to an INTEGER value. For BOOLEAN values,
true is defined to be 1 and false is defined to be 0.

Arguments Name Type Description

input BOOLEAN | STRING | INTEGER A value to be converted into


| FLOAT an integer.

Returns INTEGER

Considerations

toInteger(null) returns null.

If input is an INTEGER value, it will be returned unchanged.

If the parsing fails, null will be returned.

If input is the boolean value false, 0 will be returned.

If input is the boolean value true, 1 will be returned.

This function will return an error if provided with an expression that is not a BOOLEAN, FLOAT, INTEGER or a STRING value.

Example 193. toInteger()

Query

RETURN toInteger('42'), toInteger('not a number'), toInteger(true)

Result

toInteger('42') toInteger('not a number') toInteger(true)

42 <null> 1

Rows: 1

toIntegerOrNull()
Details

Syntax toIntegerOrNull(input)

Description Converts a value to an INTEGER value, or null if the value cannot be converted.

Arguments Name Type Description

input ANY A value to be converted into


an integer or null.

Returns INTEGER

Considerations

toIntegerOrNull(null) returns null.

If input is an INTEGER value, it will be returned unchanged.

If the parsing fails, null will be returned.

If input is the BOOLEAN value false, 0 will be returned.

If input is the BOOLEAN value true, 1 will be returned.

If the input is not a BOOLEAN, FLOAT, INTEGER or a STRING value, null will be returned.

Example 194. toIntegerOrNull()

Query

RETURN toIntegerOrNull('42'), toIntegerOrNull('not a number'), toIntegerOrNull(true), toIntegerOrNull(['A', 'B', 'C'])

Result

toIntegerOrNull('42') toIntegerOrNull('not a number') toIntegerOrNull(true) toIntegerOrNull(['A', 'B', 'C'])

42 <null> 1 <null>

Rows: 1

type()
Details

Syntax type(input)

Description Returns a STRING representation of the RELATIONSHIP type.

Arguments Name Type Description

input RELATIONSHIP A relationship.

Returns STRING

Considerations

type(null) returns null.

Example 195. type()

Query

MATCH (n)-[r]->()
WHERE n.name = 'Alice'
RETURN type(r)

The relationship type of r is returned.

Result

type(r)

"KNOWS"

"KNOWS"

Rows: 2

valueType() (introduced in Neo4j 5.13)
Details

Syntax valueType(input)

Description Returns a STRING representation of the most precise value type that the given expression
evaluates to.

Arguments Name Type Description

input ANY A value to return the type of.

Returns STRING

The output is deterministic and makes use of Type Normalization.

Considerations:

Future releases of Cypher may include updates to the current type system. This can include the
introduction of new types and subtypes of already supported types. If a new type is introduced, it will be
returned by the valueType() function as soon as it is released. However, if a more precise subtype of a
previously supported type is introduced, it would be considered a breaking change. As a result, any new
subtypes introduced after the release of Neo4j 5.13 will not be returned by the valueType() function until
the next major release of Neo4j.

For example, the function currently returns "FLOAT", but if a more specific FLOAT type was added, e.g.
FLOAT32, this would be considered more specific and not be returned until the next major release of Neo4j.
As a result,"FLOAT" would continue to be returned for any FLOAT32 values until the next major release.

With this in mind, the below list contains all supported types (as of Neo4j 5.13) displayed by the
valueType() function until the next major release of Neo4j:

• Predefined types
◦ NOTHING

◦ NULL

◦ BOOLEAN

◦ STRING

◦ INTEGER

◦ FLOAT

◦ DATE

◦ LOCAL TIME

◦ ZONED TIME

◦ LOCAL DATETIME

◦ ZONED DATETIME

◦ DURATION

◦ POINT

◦ NODE

◦ RELATIONSHIP

• Constructed types
◦ MAP

◦ LIST<INNER_TYPE> (ordered by the inner type)

◦ PATH

• Dynamic union types


◦ INNER_TYPE_1 | INNER_TYPE_2… (ordered by specific rules for closed dynamic union type)

◦ ANY

This should be taken into account when relying on the output of the valueType() function.

See the type predicate expression for an alternative way of testing type values.

Example 196. valueType()

Query

UNWIND ["abc", 1, 2.0, true, [date()]] AS value


RETURN valueType(value) AS result

Result

result

"STRING NOT NULL"

"INTEGER NOT NULL"

"FLOAT NOT NULL"

"BOOLEAN NOT NULL"

"LIST<DATE NOT NULL> NOT NULL"

Rows: 5

Spatial functions
Spatial functions are used to specify 2D or 3D POINT values in a Coordinate Reference System (CRS) and
to calculate the geodesic distance between two POINT values.

Example graph
The following graph is used for some of the examples below.

[Example graph: a TrainStation node (city 'Copenhagen', latitude 55.672874, longitude 12.564590) connected by a TRAVEL_ROUTE relationship to an Office node (city 'Malmö', latitude 55.611784, longitude 12.994341).]

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
(copenhagen:TrainStation {latitude: 55.672874, longitude: 12.564590, city: 'Copenhagen'}),
(malmo:Office {latitude: 55.611784, longitude: 12.994341, city: 'Malmö'}),
(copenhagen)-[:TRAVEL_ROUTE]->(malmo)

point()
Details

Syntax point(input)

Description Returns a 2D or 3D point object, given two or respectively three coordinate values in the
Cartesian coordinate system or WGS 84 geographic coordinate system.

Arguments Name Type Description

input MAP Cartesian 2D: {x :: FLOAT, y :: FLOAT, crs = "cartesian" :: STRING, srid = 7203 :: INTEGER}

Cartesian 3D: {x :: FLOAT, y :: FLOAT, z :: FLOAT, crs = "cartesian-3D" :: STRING, srid = 9157 :: INTEGER}

WGS 84 2D: {longitude | x :: FLOAT, latitude | y :: FLOAT, crs = "WGS-84-2D" :: STRING, srid = 4326 :: INTEGER}

WGS 84 3D: {longitude | x :: FLOAT, latitude | y :: FLOAT, height | z :: FLOAT, crs = "WGS-84-3D" :: STRING, srid = 4979 :: INTEGER}

Returns POINT

Considerations

If any argument provided to point() is null, null will be returned.

If the coordinates are specified using latitude and longitude, the crs or srid fields are optional and inferred to be 'WGS-84'
(srid:4326) for 2D points or 'WGS-84-3D' (srid:4979) for 3D points.

If the coordinates are specified using x and y, then either the crs or srid field is required if a geographic CRS is desired.

If the height/z key and value is not provided, a 2D POINT will be returned in either the WGS 84 or Cartesian CRS, depending
on the coordinate system used.

The crs or srid fields are optional and default to the Cartesian CRS (which means srid:7203) for 2D points or the 3D
Cartesian CRS (which means srid:9157) for 3D points.

Example 197. point() - WGS 84 2D

Query

RETURN point({longitude: 56.7, latitude: 12.78}) AS point

A 2D POINT with a longitude of 56.7 and a latitude of 12.78 in the WGS 84 CRS is returned.

Result

point

point({srid:4326, x:56.7, y:12.78})

Rows: 1

Example 198. point() - WGS 84 2D

Query

RETURN point({x: 2.3, y: 4.5, crs: 'WGS-84'}) AS point

x and y coordinates may be used in the WGS 84 CRS instead of longitude and latitude,
respectively, providing crs is set to 'WGS-84', or srid is set to 4326.

Result

point

point({srid:4326, x:2.3, y:4.5})

Rows: 1

Example 199. point() - WGS 84 2D

Query

MATCH (p:Office)
RETURN point({longitude: p.longitude, latitude: p.latitude}) AS officePoint

A 2D POINT representing the coordinates of the city of Malmo in the WGS 84 CRS is returned.

Result

officePoint

point({srid:4326, x:12.994341, y:55.611784})

Rows: 1

Example 200. point() - WGS 84 3D

Query

RETURN point({longitude: 56.7, latitude: 12.78, height: 8}) AS point

A 3D POINT with a longitude of 56.7, a latitude of 12.78 and a height of 8 meters in the WGS 84
CRS is returned.

Result

point

point({srid:4979, x:56.7, y:12.78, z:8.0})

Rows: 1

Example 201. point() - Cartesian 2D

Query

RETURN point({x: 2.3, y: 4.5}) AS point

A 2D POINT with an x coordinate of 2.3 and a y coordinate of 4.5 in the Cartesian CRS is returned.

Result

point

point({srid:7203, x:2.3, y:4.5})

Rows: 1

Example 202. point() - Cartesian 3D

Query

RETURN point({x: 2.3, y: 4.5, z: 2}) AS point

A 3D POINT with an x coordinate of 2.3, a y coordinate of 4.5 and a z coordinate of 2 in the Cartesian
CRS is returned.

Result

point

point({srid:9157, x:2.3, y:4.5, z:2.0})

Rows: 1

Example 203. point() - null

Query

RETURN point(null) AS p

If null is provided as the argument, null is returned.

Result

p

<null>

Rows: 1

point.distance()
Details

Syntax point.distance(from, to)

Description Returns a FLOAT representing the distance between any two points in the same CRS. If the
points are in the WGS 84 CRS, the function returns the geodesic distance (i.e., the shortest
path along the curved surface of the Earth). If the points are in a Cartesian CRS, the function
returns the Euclidean distance (i.e., the shortest straight-line distance in a flat, planar space).

Arguments Name Type Description

from POINT A start point.

to POINT An end point in the same CRS


as the start point.

Returns FLOAT

• If the POINT values are in the Cartesian CRS (2D or 3D), then the units of the returned distance will be
the same as the units of the points, calculated using Pythagoras' theorem.

• If the POINT values are in the WGS-84 CRS (2D), then the units of the returned distance will be meters,
based on the haversine formula over a spherical Earth approximation.

• If the POINT values are in the WGS-84 CRS (3D), then the units of the returned distance will be meters.
◦ The distance is calculated in two steps.

▪ First, a haversine formula over a spherical Earth is used, at the average height of the two
points.
▪ To account for the difference in height, Pythagoras' theorem is used, combining the previously
calculated spherical distance with the height difference.
◦ This formula works well for points close to the earth’s surface; for instance, it is well-suited for
calculating the distance of an airplane flight. It is less suitable for greater heights, however, such as
when calculating the distance between two satellites.

Considerations

point.distance(null, null) returns null.

point.distance(null, to) returns null.

point.distance(from, null) returns null.

Attempting to use points with different Coordinate Reference Systems (such as WGS 84 2D and WGS 84 3D) will return
null.

Example 204. point.distance()

Query

WITH
point({x: 2.3, y: 4.5, crs: 'cartesian'}) AS p1,
point({x: 1.1, y: 5.4, crs: 'cartesian'}) AS p2
RETURN point.distance(p1,p2) AS dist

The distance between two 2D points in the Cartesian CRS is returned.

Result

dist

1.5

Rows: 1

Example 205. point.distance()

Query

WITH
point({longitude: 12.78, latitude: 56.7, height: 100}) AS p1,
point({latitude: 56.71, longitude: 12.79, height: 100}) AS p2
RETURN point.distance(p1, p2) AS dist

The distance between two 3D points in the WGS 84 CRS is returned.

Result

dist

1269.9148706779097

Rows: 1

Example 206. point.distance()

Query

MATCH (t:TrainStation)-[:TRAVEL_ROUTE]->(o:Office)
WITH
point({longitude: t.longitude, latitude: t.latitude}) AS trainPoint,
point({longitude: o.longitude, latitude: o.latitude}) AS officePoint
RETURN round(point.distance(trainPoint, officePoint)) AS travelDistance

The distance between the train station in Copenhagen and the Neo4j office in Malmo is returned.

Result

travelDistance

27842.0

Rows: 1

Example 207. point.distance()

Query

RETURN point.distance(null, point({longitude: 56.7, latitude: 12.78})) AS d

If null is provided as one or both of the arguments, null is returned.

Result

d

null

Rows: 1

point.withinBBox()
Details

Syntax point.withinBBox(point, lowerLeft, upperRight)

Description Returns true if the provided point is within the bounding box defined by the two provided
points.

Arguments Name Type Description

point POINT A point to be confirmed in the


bounding box.

lowerLeft POINT The lower left side point of


the bounding box.

upperRight POINT The upper right side point of


the bounding box.

Returns BOOLEAN

Considerations

point.withinBBox(point, lowerLeft, upperRight) will return null if any of the arguments evaluate to null.

Attempting to use POINT values with different Coordinate Reference Systems (such as WGS 84 2D and WGS 84 3D) will
return null.

point.withinBBox will handle crossing the 180th meridian in geographic coordinates.

Switching the longitude of the lowerLeft and upperRight in geographic coordinates will switch the direction of the resulting
bounding box.

Switching the latitude of the lowerLeft and upperRight in geographic coordinates so that the former is north of the latter
will result in an empty range.
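
As a sketch of the longitude-switching behaviour described above (the coordinates reuse the Copenhagen bounding box from the following examples): swapping the two longitudes makes the box wrap the other way around the globe, so a point that lies between the original longitudes should no longer be reported as inside:

WITH
  point({longitude: 12.614, latitude: 55.66}) AS lowerLeft,
  point({longitude: 12.53, latitude: 55.70}) AS upperRight
RETURN point.withinBBox(point({longitude: 12.57, latitude: 55.68}), lowerLeft, upperRight) AS result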

Example 208. point.withinBBox()

Query

WITH
point({x: 0, y: 0, crs: 'cartesian'}) AS lowerLeft,
point({x: 10, y: 10, crs: 'cartesian'}) AS upperRight
RETURN point.withinBBox(point({x: 5, y: 5, crs: 'cartesian'}), lowerLeft, upperRight) AS result

Checking if a point in Cartesian CRS is contained in the bounding box.

Result

result

true

Rows: 1

Example 209. point.withinBBox()

Query

WITH
point({longitude: 12.53, latitude: 55.66}) AS lowerLeft,
point({longitude: 12.614, latitude: 55.70}) AS upperRight
MATCH (t:TrainStation)
WHERE point.withinBBox(point({longitude: t.longitude, latitude: t.latitude}), lowerLeft, upperRight)
RETURN count(t)

Finds all train stations contained in a bounding box around Copenhagen.

Result

count(t)

1

Rows: 1

Example 210. point.withinBBox()

Query

WITH
point({longitude: 179, latitude: 55.66}) AS lowerLeft,
point({longitude: -179, latitude: 55.70}) AS upperRight
RETURN point.withinBBox(point({longitude: 180, latitude: 55.66}), lowerLeft, upperRight) AS result

A bounding box that crosses the 180th meridian.

Result

result

true

Rows: 1

Example 211. point.withinBBox()

Query

RETURN
point.withinBBox(
null,
point({longitude: 56.7, latitude: 12.78}),
point({longitude: 57.0, latitude: 13.0})
) AS in

If null is provided as any of the arguments, null is returned.

Result

in

null

Rows: 1

String functions
String functions operate on string expressions only, and will return an error if used on any other values.
The exception to this rule is toString(), which also accepts numbers, booleans and temporal values (i.e.
DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME or DURATION values).

Functions taking a STRING as input all operate on Unicode characters rather than on a standard char[]. For
example, the size() function applied to any Unicode character will return 1, even if the character does not
fit in the 16 bits of one char.
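
For example, a character outside the Basic Multilingual Plane (here an emoji, used purely as an illustration) still counts as a single character:

RETURN size('🙂') AS s

Per the rule above, s should be 1.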

When toString() is applied to a temporal value, it returns a STRING representation suitable for parsing by
the corresponding temporal functions. This STRING will therefore be formatted according to the ISO 8601
format.

See also String operators.

btrim() (introduced in Neo4j 5.20)


Details

Syntax btrim(input[, trimCharacterString])

Description Returns the given STRING with leading and trailing whitespace removed, optionally specifying
a trimCharacterString to remove.

Arguments Name Type Description

input STRING A value from which the


leading and trailing trim
character will be removed.

trimCharacterString STRING A character to be removed


from the start and end of the
given string.

Returns STRING

Considerations

btrim(null) returns null.

btrim(null, null) returns null.

btrim("hello", null) returns null.

btrim(null, ' ') returns null.

If trimCharacterString is not specified then all leading and trailing whitespace will be removed.

Example 212. btrim()

Query

RETURN btrim(' hello '), btrim('xxyyhelloxyxy', 'xy')

Result

btrim(' hello') btrim('xxyyhelloxyxy', 'xy')

"hello" "hello"

Rows: 1

left()
Details

Syntax left(original, length)

Description Returns a STRING containing the specified number (INTEGER) of leftmost characters in the
given STRING.

Arguments Name Type Description

original STRING A string value whose


rightmost characters will be
trimmed.

length INTEGER The length of the leftmost


characters to be returned.

Returns STRING

Considerations

left(null, length) returns null.

left(null, null) returns null.

left(original, null) will raise an error.

If length is not a positive INTEGER, an error is raised.

If length exceeds the size of original, original is returned.

Example 213. left()

Query

RETURN left('hello', 3)

Result

left('hello', 3)

"hel"

Rows: 1

lower() (introduced in Neo4j 5.21)


Details

Syntax lower(input)

Description Returns the given STRING in lowercase.

Arguments Name Type Description

input STRING A string to be converted into


lowercase.

Returns STRING

This function is an alias to the toLower() function, and it was introduced as part of Cypher’s GQL
conformance.

Considerations

lower(null) returns null.

Example 214. lower()

Query

RETURN lower('HELLO')

Result

lower('HELLO')

"hello"

Rows: 1

ltrim()
Details

Syntax ltrim(input[, trimCharacterString])

Description Returns the given STRING with leading whitespace removed, optionally specifying a
trimCharacterString to remove.

Arguments Name Type Description

input STRING A value from which the


leading trim character will be
removed.

trimCharacterString STRING A character to be removed


from the start of the given
string.

Returns STRING

Considerations

ltrim(null) returns null.

ltrim(null, null) returns null.

ltrim("hello", null) returns null.

ltrim(null, ' ') returns null.

As of Neo4j 5.20, a trimCharacterString can be specified. If this is not specified all leading whitespace will be removed.

Example 215. ltrim()

Query

RETURN ltrim(' hello'), ltrim('xxyyhelloxyxy', 'xy')

Result

ltrim(' hello') ltrim('xxyyhelloxyxy', 'xy')

"hello" "helloxyxy"

Rows: 1

normalize() (introduced in Neo4j 5.17)


Details

Syntax normalize(input[, normalForm])

Description Normalize a STRING, optionally specifying a normalization form.

Arguments Name Type Description

input STRING A value to be normalized.

normalForm [NFC, NFD, NFKC, NFKD] A keyword specifying any of


the normal forms; NFC, NFD,
NFKC or NFKD.

Returns STRING

Unicode normalization is a process that transforms different representations of the same string into a
standardized form. For more information, see the documentation for Unicode normalization forms.

The normalize() function is useful for converting STRING values into comparable forms. When comparing
two STRING values, it is their Unicode codepoints that are compared. In Unicode, a codepoint for a
character that looks the same may be represented by two, or more, different codepoints. For example, the
character < can be represented as \uFE64 (﹤) or \u003C (<). To the human eye, the characters may appear
identical. However, if compared, Cypher will return false as \uFE64 does not equal \u003C. Using the
normalize() function, it is possible to normalize the codepoint \uFE64 to \u003C, creating a single
codepoint representation, allowing them to be successfully compared.
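
A minimal sketch of this comparison (the NFKC form is used because \uFE64 is a compatibility character; see the section on normalize() with a specified normal form below):

RETURN '\uFE64' = '\u003C' AS rawEquality,
  normalize('\uFE64', NFKC) = '\u003C' AS normalizedEquality

The first column should be false and the second true.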

Considerations

normalize(null) returns null.

Example 216. normalize()

Query

RETURN normalize('\u212B') = '\u00C5' AS result

Result

result

true

Rows: 1

To check if a STRING is normalized, use the IS NORMALIZED operator.

normalize() with specified normal form


There are two main types of normalization forms:

• Canonical equivalence: The NFC (default) and NFD are forms of canonical equivalence. This means that
codepoints that represent the same abstract character will be normalized to the same codepoint (and
have the same appearance and behavior). The NFC form will always give the composed canonical form
(in which the combined codes are replaced with a single representation, if possible). The NFD form
gives the decomposed form (the opposite of the composed form, which converts the combined
codepoints into a split form if possible).

• Compatibility normalization: NFKC and NFKD are forms of compatibility normalization. All canonically
equivalent sequences are compatible, but not all compatible sequences are canonical. This means that
a character normalized in NFC or NFD should also be normalized in NFKC and NFKD. Other characters with
only slight differences in appearance should be compatibly equivalent.

For example, the Greek Upsilon with Acute and Hook Symbol ϓ can be represented by the Unicode
codepoint: \u03D3.

• Normalized in NFC: \u03D3 Greek Upsilon with Acute and Hook Symbol (ϓ)

• Normalized in NFD: \u03D2\u0301 Greek Upsilon with Hook Symbol + Combining Acute Accent (ϓ)

• Normalized in NFKC: \u038E Greek Capital Letter Upsilon with Tonos (Ύ)

• Normalized in NFKD: \u03A5\u0301 Greek Capital Letter Upsilon + Combining Acute Accent (Ύ)

In the compatibility normalization forms (NFKC and NFKD) the character is visibly different as it no longer
contains the hook symbol.
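
The following sketch applies each of the four forms to the codepoint \u03D3 discussed above; based on the list above, the results should be the codepoints \u03D3, \u03D2\u0301, \u038E, and \u03A5\u0301 respectively:

RETURN normalize('\u03D3', NFC) AS nfc,
  normalize('\u03D3', NFD) AS nfd,
  normalize('\u03D3', NFKC) AS nfkc,
  normalize('\u03D3', NFKD) AS nfkd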

Example 217. normalize() with specified normalization form

Query

RETURN normalize('\uFE64', NFKC) = '\u003C' AS result

Result

result

true

Rows: 1

replace()
Details

Syntax replace(original, search, replace)

Description Returns a STRING in which all occurrences of a specified search STRING in the given STRING
have been replaced by another (specified) replacement STRING.

Arguments Name Type Description

original STRING The string to be modified.

search STRING The value to replace in the


original string.

replace STRING The value to be inserted in


the original string.

Returns STRING

Considerations

If any argument is null, null will be returned.

If search is not found in original, original will be returned.

Example 218. replace()

Query

RETURN replace("hello", "l", "w")

Result

replace("hello", "l", "w")

"hewwo"

Rows: 1

reverse()
Details

Syntax reverse(input)

Description Returns a STRING or LIST<ANY> in which the order of all characters or elements in the given
STRING or LIST<ANY> has been reversed.

Arguments Name Type Description

input STRING | LIST<ANY> The string or list to be reversed.

Returns STRING | LIST<ANY>

Considerations

reverse(null) returns null.

See also List functions → reverse.

Example 219. reverse()

Query

RETURN reverse('palindrome')

Result

reverse('palindrome')

"emordnilap"

Rows: 1

right()
Details

Syntax right(original, length)

Description Returns a STRING containing the specified number of rightmost characters in the given
STRING.

Arguments Name Type Description

original STRING A string value whose leftmost characters will be trimmed.

length INTEGER The length of the rightmost characters to be returned.

Returns STRING

Considerations

right(null, length) returns null.

right(null, null) returns null.

right(original, null) will raise an error.

If length is not a positive INTEGER, an error is raised.

If length exceeds the size of original, original is returned.

Example 220. right()

Query

RETURN right('hello', 3)

Result

right('hello', 3)

"llo"

Rows: 1

rtrim()
Details

Syntax rtrim(input[, trimCharacterString])

Description Returns the given STRING with trailing whitespace removed, optionally specifying a
trimCharacterString of characters to remove.

Arguments Name Type Description

input STRING A value from which the trailing trim characters will be removed.

trimCharacterString STRING The characters to be removed from the end of the given string.

Returns STRING

Considerations

rtrim(null) returns null.

rtrim(null, null) returns null.

rtrim("hello", null) returns null.

rtrim(null, ' ') returns null.

As of Neo4j 5.20, a trimCharacterString can be specified. If this is not specified all trailing whitespace will be removed.

Example 221. rtrim()

Query

RETURN rtrim('hello '), rtrim('xxyyhelloxyxy', 'xy')

Result

rtrim('hello ') rtrim('xxyyhelloxyxy', 'xy')

"hello" "xxyyhello"

Rows: 1

split()
Details

Syntax split(original, splitDelimiters)

Description Returns a LIST<STRING> resulting from the splitting of the given STRING around matches of
the given delimiter(s).

Arguments Name Type Description

original STRING The string to be split.

splitDelimiters STRING | LIST<STRING> The string with which to split the original string.

Returns LIST<STRING>

Considerations

split(null, splitDelimiter) returns null.

split(original, null) returns null.

Example 222. split()

Query

RETURN split('one,two', ',')

Result

split('one,two', ',')

["one","two"]

Rows: 1
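
Because splitDelimiters may also be a LIST<STRING>, a string can be split around several different delimiters
in a single call. The following query is an illustrative sketch (the input string is an assumption, not one of
the manual's examples); it should return ["one", "two", "three"]:

RETURN split('one,two;three', [',', ';']) AS parts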

substring()
Details

Syntax substring(original, start, length)

Description Returns a substring of a given length from the given STRING, beginning with a 0-based index
start.

Arguments Name Type Description

original STRING The string to be shortened.

start INTEGER The start position of the new string.

length INTEGER The length of the new string.

Returns STRING

Considerations

start uses a zero-based index.

If length is omitted, the function returns the substring starting at the position given by start and extending to the end of
original.

If original is null, null is returned.

If either start or length is null or a negative integer, an error is raised.

If start is 0, the substring will start at the beginning of original.

If length is 0, the empty STRING will be returned.

Example 223. substring()

Query

RETURN substring('hello', 1, 3), substring('hello', 2)

Result

substring('hello', 1, 3) substring('hello', 2)

"ell" "llo"

Rows: 1

toLower()
Details

Syntax toLower(input)

Description Returns the given STRING in lowercase.

Arguments Name Type Description

input STRING A string to be converted into lowercase.

Returns STRING

Considerations

toLower(null) returns null.

Example 224. toLower()

Query

RETURN toLower('HELLO')

Result

toLower('HELLO')

"hello"

Rows: 1

toString()
Details

Syntax toString(input)

Description Converts an INTEGER, FLOAT, BOOLEAN, POINT or temporal type (i.e. DATE, ZONED TIME, LOCAL
TIME, ZONED DATETIME, LOCAL DATETIME or DURATION) value to a STRING.

Arguments Name Type Description

input ANY A value to be converted into a string.

Returns STRING

Considerations

toString(null) returns null.

If input is a STRING, it will be returned unchanged.

This function will return an error if provided with an expression that is not an INTEGER, FLOAT, BOOLEAN, STRING, POINT,
DURATION, DATE, ZONED TIME, LOCAL TIME, LOCAL DATETIME or ZONED DATETIME value.

Example 225. toString()

Query

RETURN
toString(11.5),
toString('already a string'),
toString(true),
toString(date({year: 1984, month: 10, day: 11})) AS dateString,
toString(datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14, millisecond:
341, timezone: 'Europe/Stockholm'})) AS datetimeString,
toString(duration({minutes: 12, seconds: -60})) AS durationString

Result

toString(11.5)  toString('already a string')  toString(true)  dateString    datetimeString                                      durationString

"11.5"          "already a string"            "true"          "1984-10-11"  "1984-10-11T12:31:14.341+01:00[Europe/Stockholm]"   "PT11M"

Rows: 1

toStringOrNull()
Details

Syntax toStringOrNull(input)

Description Converts an INTEGER, FLOAT, BOOLEAN, POINT or temporal type (i.e. DATE, ZONED TIME, LOCAL
TIME, ZONED DATETIME, LOCAL DATETIME or DURATION) value to a STRING, or null if the value
cannot be converted.

Arguments Name Type Description

input ANY A value to be converted into a string or null.

Returns STRING

Considerations

toStringOrNull(null) returns null.

If the input is not an INTEGER, FLOAT, BOOLEAN, STRING, POINT, DURATION, DATE, ZONED TIME, LOCAL TIME, LOCAL DATETIME or
ZONED DATETIME value, null will be returned.

Example 226. toStringOrNull()

Query

RETURN toStringOrNull(11.5),
toStringOrNull('already a string'),
toStringOrNull(true),
toStringOrNull(date({year: 1984, month: 10, day: 11})) AS dateString,
toStringOrNull(datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14,
millisecond: 341, timezone: 'Europe/Stockholm'})) AS datetimeString,
toStringOrNull(duration({minutes: 12, seconds: -60})) AS durationString,
toStringOrNull(['A', 'B', 'C']) AS list

Result

toStringOrNull(11.5)  toStringOrNull('already a string')  toStringOrNull(true)  dateString    datetimeString                                      durationString  list

"11.5"                "already a string"                  "true"                "1984-10-11"  "1984-10-11T12:31:14.341+01:00[Europe/Stockholm]"   "PT11M"        <null>

Rows: 1

toUpper()
Details

Syntax toUpper(input)

Description Returns the given STRING in uppercase.

Arguments Name Type Description

input STRING A string to be converted into uppercase.

Returns STRING

Considerations

toUpper(null) returns null.

Example 227. toUpper()

Query

RETURN toUpper('hello')

Result

toUpper('hello')

"HELLO"

Rows: 1

trim()
Details

Syntax trim(trimSpecification, trimCharacterString, input)

Description Returns the given STRING with leading and/or trailing trimCharacterString removed.

Arguments Name Type Description

trimSpecification [LEADING, TRAILING, BOTH] The parts of the string to trim: LEADING, TRAILING, or BOTH.

trimCharacterString STRING The characters to be removed from the start and/or end of the given string.

input STRING A value from which all leading and/or trailing trim characters will be removed.

Returns STRING

Considerations

trim(null) returns null.

trim(null FROM "hello") returns null.

trim(" " FROM null) returns null.

trim(BOTH null FROM null) returns null.

As of Neo4j 5.20, a trimSpecification and a trimCharacterString can be specified. If these are not specified all leading
and/or trailing whitespace will be removed.

Example 228. trim()

Query

RETURN trim(' hello '), trim(BOTH 'x' FROM 'xxxhelloxxx')

Result

trim(' hello ') trim(BOTH 'x' FROM 'xxxhelloxxx')

"hello" "hello"

Rows: 1
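
In addition to BOTH, the trimSpecification can be LEADING or TRAILING to strip the trim characters from only
one end of the string. A minimal sketch (the input values are assumptions for illustration); the expected
results are "helloxx" and "xxhello" respectively:

RETURN trim(LEADING 'x' FROM 'xxhelloxx') AS leadingTrimmed,
       trim(TRAILING 'x' FROM 'xxhelloxx') AS trailingTrimmed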

upper() Introduced in 5.21


Details

Syntax upper(input)

Description Returns the given STRING in uppercase.

Arguments Name Type Description

input STRING A string to be converted into uppercase.

Returns STRING

This function is an alias to the toUpper() function, and it was introduced as part of Cypher’s GQL
conformance.

Considerations

upper(null) returns null.

Example 229. upper()

Query

RETURN upper('hello')

Result

upper('hello')

"HELLO"

Rows: 1

Temporal functions - duration


Duration functions allow for the creation and manipulation of temporal DURATION values.

See also Temporal values and Temporal operators.

duration()
Details

Syntax duration(input)

Description Creates a DURATION value.

Arguments Name Type Description

input ANY A map optionally containing the following keys: 'years', 'quarters', 'months', 'weeks', 'days',
'hours', 'minutes', 'seconds', 'milliseconds', 'microseconds', or 'nanoseconds'.

Returns DURATION

Considerations

At least one parameter must be provided (duration() and duration({}) are invalid).

There is no constraint on how many of the parameters are provided.

It is possible to have a DURATION where the amount of a smaller unit (e.g. seconds) exceeds the threshold of a larger unit (e.g.
days).

The values of the parameters may be expressed as decimal fractions.

The values of the parameters may be arbitrarily large.

The values of the parameters may be negative.

The components of DURATION objects are individually accessible.
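
Because the components of DURATION values are individually accessible, a created duration can be inspected
directly. A minimal sketch (assuming the standard duration component names months, days, and hours); it
should return 5, 25, and 1:

WITH duration({months: 5, days: 25, hours: 1}) AS d
RETURN d.months AS months, d.days AS days, d.hours AS hours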

Example 230. duration() using duration components

Query

UNWIND [
duration({days: 14, hours:16, minutes: 12}),
duration({months: 5, days: 1.5}),
duration({months: 0.75}),
duration({weeks: 2.5}),
duration({minutes: 1.5, seconds: 1, milliseconds: 123, microseconds: 456, nanoseconds: 789}),
duration({minutes: 1.5, seconds: 1, nanoseconds: 123456789})
] AS aDuration
RETURN aDuration

Result

aDuration

P14DT16H12M

P5M1DT12H

P22DT19H51M49.5S

P17DT12H

PT1M31.123456789S

PT1M31.123456789S

Rows: 6

Example 231. duration() using STRING values

Query

UNWIND [
duration("P14DT16H12M"),
duration("P5M1.5D"),
duration("P0.75M"),
duration("PT0.75M"),
duration("P2012-02-02T14:37:21.545")
] AS aDuration
RETURN aDuration

Result

aDuration

P14DT16H12M

P5M1DT12H

P22DT19H51M49.5S

PT45S

P2012Y2M2DT14H37M21.545S

Rows: 5

duration.between()
Details

Syntax duration.between(from, to)

Description Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in
logical units.

Arguments Name Type Description

from ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the starting instant.

to ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the ending instant.

Returns DURATION

Considerations

If to occurs earlier than from, the resulting DURATION will be negative.

If from has a time component and to does not, the time component of to is assumed to be midnight, and vice versa.

If from has a time zone component and to does not, the time zone component of to is assumed to be the same as that of
from, and vice versa.

If to has a date component and from does not, the date component of from is assumed to be the same as that of to, and vice
versa.

Example 232. duration.between()

Query

UNWIND [
duration.between(date("1984-10-11"), date("1985-11-25")),
duration.between(date("1985-11-25"), date("1984-10-11")),
duration.between(date("1984-10-11"), datetime("1984-10-12T21:40:32.142+0100")),
duration.between(date("2015-06-24"), localtime("14:30")),
duration.between(localtime("14:30"), time("16:30+0100")),
duration.between(localdatetime("2015-07-21T21:40:32.142"), localdatetime("2016-07-21T21:45:22.142")),
duration.between(datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/Stockholm'}),
datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/London'}))
] AS aDuration
RETURN aDuration

Result

aDuration

P1Y1M14D

P-1Y-1M-14D

P1DT21H40M32.142S

PT14H30M

PT2H

P1YT4M50S

PT1H

Rows: 7

duration.inDays()
Details

Syntax duration.inDays(from, to)

Description Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in
days.

Arguments Name Type Description

from ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the starting instant.

to ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the ending instant.

Returns DURATION

Considerations

If to occurs earlier than from, the resulting DURATION will be negative.

If from has a time component and to does not, the time component of to is assumed to be midnight, and vice versa.

If from has a time zone component and to does not, the time zone component of to is assumed to be the same as that of
from, and vice versa.

If from has a date component and to does not, the date component of to is assumed to be the same as that of from, and vice
versa.

Any difference smaller than a whole day is disregarded.

Get the total number of days in a DURATION by returning the days component. For more information, see Components of
durations.

Example 233. duration.inDays()

Query

UNWIND [
duration.inDays(date("1984-10-11"), date("1985-11-25")),
duration.inDays(date("1985-11-25"), date("1984-10-11")),
duration.inDays(date("1984-10-11"), datetime("1984-10-12T21:40:32.142+0100")),
duration.inDays(date("2015-06-24"), localtime("14:30")),
duration.inDays(localdatetime("2015-07-21T21:40:32.142"), localdatetime("2016-07-21T21:45:22.142")),
duration.inDays(datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/Stockholm'}),
datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/London'}))
] AS aDuration
RETURN aDuration

Result

aDuration

P410D

P-410D

P1D

PT0S

P366D

PT0S

Rows: 6

duration.inMonths()
Details

Syntax duration.inMonths(from, to)

Description Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in
months.

Arguments Name Type Description

from ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the starting instant.

to ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the ending instant.

Returns DURATION

Considerations

If to occurs earlier than from, the resulting DURATION will be negative.

If from has a time component and to does not, the time component of to is assumed to be midnight, and vice versa.

If from has a time zone component and to does not, the time zone component of to is assumed to be the same as that of
from, and vice versa.

If from has a date component and to does not, the date component of to is assumed to be the same as that of from, and vice
versa.

Any difference smaller than a whole month is disregarded.

Get the total number of months in a DURATION by returning the months component. For more information, see Components of
durations.

Example 234. duration.inMonths()

Query

UNWIND [
duration.inMonths(date("1984-10-11"), date("1985-11-25")),
duration.inMonths(date("1985-11-25"), date("1984-10-11")),
duration.inMonths(date("1984-10-11"), datetime("1984-10-12T21:40:32.142+0100")),
duration.inMonths(date("2015-06-24"), localtime("14:30")),
duration.inMonths(localdatetime("2015-07-21T21:40:32.142"), localdatetime("2016-07-
21T21:45:22.142")),
duration.inMonths(datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/Stockholm'}),
datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/London'}))
] AS aDuration
RETURN aDuration

Result

aDuration

P1Y1M

P-1Y-1M

PT0S

PT0S

P1Y

PT0S

Rows: 6

duration.inSeconds()
Details

Syntax duration.inSeconds(from, to)

Description Computes the DURATION between the from instant (inclusive) and the to instant (exclusive) in
seconds.

Arguments Name Type Description

from ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the starting instant.

to ANY A temporal instant type (DATE, LOCAL TIME, ZONED TIME, LOCAL DATETIME, ZONED DATETIME)
representing the ending instant.

Returns DURATION

Considerations

If to occurs earlier than from, the resulting DURATION will be negative.

If from has a time component and to does not, the time component of to is assumed to be midnight, and vice versa.

If from has a time zone component and to does not, the time zone component of to is assumed to be the same as that of
from, and vice versa.

If from has a date component and to does not, the date component of to is assumed to be the same as that of from, and vice
versa.

Get the total number of seconds in a DURATION by returning the seconds component. For more information, see Components of
durations.

Example 235. duration.inSeconds()

Query

UNWIND [
duration.inSeconds(date("1984-10-11"), date("1984-10-12")),
duration.inSeconds(date("1984-10-12"), date("1984-10-11")),
duration.inSeconds(date("1984-10-11"), datetime("1984-10-12T01:00:32.142+0100")),
duration.inSeconds(date("2015-06-24"), localtime("14:30")),
duration.inSeconds(datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/Stockholm'}),
datetime({year: 2017, month: 10, day: 29, hour: 0, timezone: 'Europe/London'}))
] AS aDuration
RETURN aDuration

Result

aDuration

PT24H

PT-24H

PT25H32.142S

PT14H30M

PT1H

Rows: 5

Temporal functions - instant types


Temporal functions allow for the creation and manipulation of values for each temporal type — DATE, ZONED
TIME, LOCAL TIME, ZONED DATETIME, and LOCAL DATETIME.

The following functions are included on this page:

DATE: date(), date.realtime(), date.statement(), date.transaction()

ZONED DATETIME: datetime(), datetime.fromEpoch(), datetime.fromEpochMillis(), datetime.realtime(), datetime.statement(), datetime.transaction()

LOCAL DATETIME: localdatetime(), localdatetime.realtime(), localdatetime.statement(), localdatetime.transaction()

LOCAL TIME: localtime(), localtime.realtime(), localtime.statement(), localtime.transaction()

ZONED TIME: time(), time.realtime(), time.statement(), time.transaction()

See also Temporal (Date/Time) values and Temporal operators.

Temporal instant types

An overview of temporal instant type creation


Each function bears the same name as the type, and constructs the type it corresponds to in one of four
ways:

• Capturing the current time.

• Composing the components of the type.

• Parsing a STRING representation of the temporal value.

• Selecting and composing components from another temporal value by


◦ either combining temporal values (such as combining a DATE with a ZONED TIME to create a ZONED
DATETIME), or
◦ selecting parts from a temporal value (such as selecting the DATE from a ZONED DATETIME); the
extractors — groups of components which can be selected — are:
▪ date — contains all components for a DATE (conceptually year, month and day).

▪ time — contains all components for a ZONED TIME (hour, minute, second, and sub-seconds;
namely millisecond, microsecond and nanosecond). If the type being created and the type from
which the time component is being selected both contain timezone (and a timezone is not
explicitly specified) the timezone is also selected.
▪ datetime — selects all components, and is useful for overriding specific components.
Analogously to time, if the type being created and the type from which the time component is
being selected both contain timezone (and a timezone is not explicitly specified) the timezone is
also selected.
◦ In effect, this allows for the conversion between different temporal types, allowing 'missing'
components to be specified, as illustrated in the sketch below.
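
A minimal sketch of this kind of conversion, selecting the date and time groups out of a ZONED DATETIME
(the literal value is an assumption chosen for illustration); the expected results are the DATE 2015-07-21
and the ZONED TIME 21:40:32.142+01:00:

WITH datetime('2015-07-21T21:40:32.142+01:00') AS dt
RETURN date({date: dt}) AS justDate, time({time: dt}) AS justTime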

Temporal instant type creation functions

Function                                                   Supported types

Getting the current value                                  DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME

Creating a calendar-based (Year-Month-Day) value           DATE, ZONED DATETIME, LOCAL DATETIME

Creating a week-based (Year-Week-Day) value                DATE, ZONED DATETIME, LOCAL DATETIME

Creating a quarter-based (Year-Quarter-Day) value          DATE, ZONED DATETIME, LOCAL DATETIME

Creating an ordinal (Year-Day) value                       DATE, ZONED DATETIME, LOCAL DATETIME

Creating a value from time components                      ZONED TIME, LOCAL TIME

Creating a value from other temporal values using
extractors (i.e. converting between different types)       DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME

Creating a value from a STRING                             DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME

Creating a value from a timestamp                          ZONED DATETIME

All the temporal instant types, including those that do not contain timezone
information (such as DATE, LOCAL TIME, and LOCAL DATETIME), allow a timezone to be
specified for the functions that retrieve the current instant. This allows for
the retrieval of the current instant in the specified timezone.

Controlling which clock to use


The functions which create temporal instant values based on the current instant use the statement clock
as default. However, there are three different clocks available for more fine-grained control:

• transaction: The same instant is produced for each invocation within the same transaction. A different
time may be produced for different transactions.

• statement: The same instant is produced for each invocation within the same statement. A different
time may be produced for different statements within the same transaction.

• realtime: The instant produced will be the live clock of the system.
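
A small sketch of the statement clock guarantee (an illustration, not one of the manual's examples): because
the statement clock produces the same instant for every invocation within one statement, the comparison below
should return true, whereas two realtime invocations are not guaranteed to be equal:

RETURN datetime.statement() = datetime.statement() AS sameWithinStatement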

The following table lists the different sub-functions for specifying the clock to be used when creating the
current temporal instant value:

Type            default          transaction                  statement                   realtime

Date            date()           date.transaction()           date.statement()            date.realtime()

Time            time()           time.transaction()           time.statement()            time.realtime()

LocalTime       localtime()      localtime.transaction()      localtime.statement()       localtime.realtime()

DateTime        datetime()       datetime.transaction()       datetime.statement()        datetime.realtime()

LocalDateTime   localdatetime()  localdatetime.transaction()  localdatetime.statement()   localdatetime.realtime()

Truncating temporal values


A temporal instant value can be created by truncating another temporal instant value at the nearest
preceding point in time at a specified component boundary (namely, a truncation unit). A temporal instant
value created in this way will have all components which are smaller than the specified truncation unit set
to their default values.

It is possible to supplement the truncated value by providing a map containing components which are
smaller than the truncation unit. This will have the effect of overriding the default values which would
otherwise have been set for these smaller components.
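
As a brief sketch of truncation with a supplementary map (the input date is an assumption chosen for
illustration): truncating to 'year' normally resets the month and day to 1, but providing {day: 5} overrides
the default day. The expected results are 2017-01-01 and 2017-01-05:

RETURN date.truncate('year', date('2017-11-11')) AS plainTruncation,
       date.truncate('year', date('2017-11-11'), {day: 5}) AS withFields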

The following table lists the supported truncation units and the corresponding sub-functions:

Truncation unit   Corresponding sub-functions

millennium        date.truncate('millennium', input), datetime.truncate('millennium', input), localdatetime.truncate('millennium', input)

century           date.truncate('century', input), datetime.truncate('century', input), localdatetime.truncate('century', input)

decade            date.truncate('decade', input), datetime.truncate('decade', input), localdatetime.truncate('decade', input)

year              date.truncate('year', input), datetime.truncate('year', input), localdatetime.truncate('year', input)

weekYear          date.truncate('weekYear', input), datetime.truncate('weekYear', input), localdatetime.truncate('weekYear', input)

quarter           date.truncate('quarter', input), datetime.truncate('quarter', input), localdatetime.truncate('quarter', input)

month             date.truncate('month', input), datetime.truncate('month', input), localdatetime.truncate('month', input)

week              date.truncate('week', input), datetime.truncate('week', input), localdatetime.truncate('week', input)

day               date.truncate('day', input), time.truncate('day', input), localtime.truncate('day', input), datetime.truncate('day', input), localdatetime.truncate('day', input)

hour              time.truncate('hour', input), localtime.truncate('hour', input), datetime.truncate('hour', input), localdatetime.truncate('hour', input)

minute            time.truncate('minute', input), localtime.truncate('minute', input), datetime.truncate('minute', input), localdatetime.truncate('minute', input)

second            time.truncate('second', input), localtime.truncate('second', input), datetime.truncate('second', input), localdatetime.truncate('second', input)

millisecond       time.truncate('millisecond', input), localtime.truncate('millisecond', input), datetime.truncate('millisecond', input), localdatetime.truncate('millisecond', input)

microsecond       time.truncate('microsecond', input), localtime.truncate('microsecond', input), datetime.truncate('microsecond', input), localdatetime.truncate('microsecond', input)

date()
Details

Syntax date( [input] )

Description Creates a DATE instant.

Arguments Name Type Description

input ANY Either a string representation of a temporal value, a map containing the single key 'timezone',
or a map containing temporal values ('date', 'year', 'month', 'day', 'week', 'dayOfWeek', 'quarter',
'dayOfQuarter', 'ordinalDay') as components.

Returns DATE

Temporal components

Name            Description

date            A DATE value.

year            An expression consisting of at least four digits that specifies the year.

month           An integer between 1 and 12 that specifies the month.

day             An integer between 1 and 31 that specifies the day of the month.

week            An integer between 1 and 53 that specifies the week.

dayOfWeek       An integer between 1 and 7 that specifies the day of the week.

quarter         An integer between 1 and 4 that specifies the quarter.

dayOfQuarter    An integer between 1 and 92 that specifies the day of the quarter.

ordinalDay      An integer between 1 and 366 that specifies the ordinal day of the year.

Considerations

If no parameters are provided, date() must be invoked (date({}) is invalid).

If no timezone is specified, the local timezone will be used.

The day of the month component will default to 1 if day is omitted.

The month component will default to 1 if month is omitted.

If month is omitted, day must also be omitted.

The day of the week component will default to 1 if dayOfWeek is omitted.

The week component will default to 1 if week is omitted.

If week is omitted, dayOfWeek must also be omitted.

The day of the quarter component will default to 1 if dayOfQuarter is omitted.

The quarter component will default to 1 if quarter is omitted.

If quarter is omitted, dayOfQuarter must also be omitted.

The ordinal day of the year component will default to 1 if ordinalDay is omitted.

String representations of temporal values must comply with the format defined for dates.

String representations of temporal values must denote a valid date; i.e. a temporal value denoting 30 February 2001 is
invalid.

date(null) returns null.

If any of the optional parameters are provided, these will override the corresponding components of date.

date(dd) may be written instead of date({date: dd}).

Example 236. date() to get the current time (no parameters provided)

Query

RETURN date() AS currentDate

The current date is returned.

Result

currentDate

2022-06-14

Rows: 1

Example 237. date() with provided timezone

Query

RETURN date({timezone: 'America/Los Angeles'}) AS currentDateInLA

The current date in California is returned.

Result

currentDateInLA

2022-06-14

Rows: 1

Creating DATE values

Example 238. date() - Creating a calendar (Year-Month-Day) DATE

Query

UNWIND [
date({year: 1984, month: 10, day: 11}),
date({year: 1984, month: 10}),
date({year: 1984})
] AS theDate
RETURN theDate

Result

theDate

1984-10-11

1984-10-01

1984-01-01

Rows: 3

Example 239. date() - Creating a week (Year-Week-Day) DATE

Query

UNWIND [
date({year: 1984, week: 10, dayOfWeek: 3}),
date({year: 1984, week: 10}),
date({year: 1984})
] AS theDate
RETURN theDate

Result

theDate

1984-03-07

1984-03-05

1984-01-01

Rows: 3

Example 240. date() - Creating a quarter (Year-Quarter-Day) DATE

Query

UNWIND [
date({year: 1984, quarter: 3, dayOfQuarter: 45}),
date({year: 1984, quarter: 3}),
date({year: 1984})
] AS theDate
RETURN theDate

Result

theDate

1984-08-14

1984-07-01

1984-01-01

Rows: 3

Example 241. date() - Creating an ordinal (Year-Day) DATE

Query

UNWIND [
date({year: 1984, ordinalDay: 202}),
date({year: 1984})
] AS theDate
RETURN theDate

The dates corresponding to ordinal day 202 of 1984 (20 July 1984) and to the default ordinal day (1 January 1984) are returned.

Result

theDate

1984-07-20

1984-01-01

Rows: 2

Example 242. date() - Creating a DATE using other temporal values as components

Query

UNWIND [
date({year: 1984, month: 11, day: 11}),
localdatetime({year: 1984, month: 11, day: 11, hour: 12, minute: 31, second: 14}),
datetime({year: 1984, month: 11, day: 11, hour: 12, timezone: '+01:00'})
] AS dd
RETURN date({date: dd}) AS dateOnly, date({date: dd, day: 28}) AS dateDay

Result

dateOnly      dateDay

1984-11-11    1984-11-28

1984-11-11    1984-11-28

1984-11-11    1984-11-28

Rows: 3

Example 243. date() - Creating a DATE from a STRING

Query

UNWIND [
date('2015-07-21'),
date('2015-07'),
date('201507'),
date('2015-W30-2'),
date('2015202'),
date('2015')
] AS theDate
RETURN theDate

Result

theDate

2015-07-21

2015-07-01

2015-07-01

2015-07-21

2015-07-21

2015-01-01

Rows: 6

date.realtime()
Details

Syntax date.realtime([ timezone ])

Description Returns the current DATE instant using the realtime clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns DATE

This returned DATE will be the live clock of the system.

Example 244. date.realtime()

Query

RETURN date.realtime() AS currentDate

Result

currentDate

2022-06-14

Rows: 1

Example 245. date.realtime()

Query

RETURN date.realtime('America/Los Angeles') AS currentDateInLA

Result

currentDateInLA

2022-06-14

Rows: 1

date.statement()
Details

Syntax date.statement([ timezone ])

Description Returns the current DATE instant using the statement clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns DATE

This returned DATE will be the same for each invocation within the same statement. However, a different
value may be produced for different statements within the same transaction.

Example 246. date.statement()

Query

RETURN date.statement() AS currentDate

Result

currentDate

2022-06-14

Rows: 1

date.transaction()
Details

Syntax date.transaction([ timezone ])

Description Returns the current DATE instant using the transaction clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns DATE

The returned DATE will be the same for each invocation within the same transaction. However, a different
value may be produced for different transactions.

Example 247. date.transaction()

Query

RETURN date.transaction() AS currentDate

Result

currentDate

2022-06-14

Rows: 1

date.truncate()
Details

Syntax date.truncate(unit [, input, fields])

Description Truncates the given temporal value to a DATE instant using the specified unit.

Arguments Name Type Description

unit STRING A string representing one of the following: 'day', 'week', 'month', 'weekYear', 'quarter',
'year', 'decade', 'century', 'millennium'.

input ANY The date to be truncated using either ZONED DATETIME, LOCAL DATETIME, or DATE.

fields MAP A list of time components smaller than those specified in unit to preserve during truncation.

Returns DATE

date.truncate() returns the DATE value obtained by truncating a specified temporal instant value at the
nearest preceding point in time at the specified component boundary (which is denoted by the truncation
unit passed as a parameter to the function). In other words, the DATE returned will have all components
that are smaller than the specified truncation unit set to their default values.

It is possible to supplement the truncated value by providing a map containing components which are
smaller than the truncation unit. This will have the effect of overriding the default values which would
otherwise have been set for these smaller components. For example, day — with some value x — may be
provided when the truncation unit STRING is 'year' in order to ensure the returned value has the day set to
x instead of the default day (which is 1).

Considerations

Any component that is provided in fields must be smaller than unit; i.e. if unit STRING is 'day', fields cannot contain
information pertaining to a month.

Any component that is not contained in fields and which is smaller than unit will be set to its minimal value.

If fields is not provided, all components of the returned value which are smaller than unit will be set to their default values.

If input is not provided, it will be set to the current date, i.e. date.truncate(unit) is equivalent to date.truncate(unit,
date()).

Example 248. date.truncate()

Query

WITH
datetime({
year: 2017, month: 11, day: 11,
hour: 12, minute: 31, second: 14, nanosecond: 645876123,
timezone: '+01:00'
}) AS d
RETURN
date.truncate('millennium', d) AS truncMillenium,
date.truncate('century', d) AS truncCentury,
date.truncate('decade', d) AS truncDecade,
date.truncate('year', d, {day: 5}) AS truncYear,
date.truncate('weekYear', d) AS truncWeekYear,
date.truncate('quarter', d) AS truncQuarter,
date.truncate('month', d) AS truncMonth,
date.truncate('week', d, {dayOfWeek: 2}) AS truncWeek,
date.truncate('day', d) AS truncDay

Result

truncMillenium  truncCentury  truncDecade  truncYear   truncWeekYear  truncQuarter  truncMonth  truncWeek   truncDay

2000-01-01      2000-01-01    2010-01-01   2017-01-05  2017-01-02     2017-10-01    2017-11-01  2017-11-07  2017-11-11

Rows: 1

datetime()
Details

Syntax datetime([ input ])

Description Creates a ZONED DATETIME instant.

Arguments Name Type Description

input ANY Either a string representation of a temporal value, a map containing the single key 'timezone',
or a map containing temporal values ('year', 'month', 'day', 'hour', 'minute', 'second', 'millisecond',
'microsecond', 'nanosecond', 'timezone') as components.

Returns ZONED DATETIME

Temporal components

Name            Description

year            An expression consisting of at least four digits that specifies the year.

month           An integer between 1 and 12 that specifies the month.

day             An integer between 1 and 31 that specifies the day of the month.

hour            An integer between 0 and 23 that specifies the hour of the day.

minute          An integer between 0 and 59 that specifies the number of minutes.

second          An integer between 0 and 59 that specifies the number of seconds.

millisecond     An integer between 0 and 999 that specifies the number of milliseconds.

microsecond     An integer between 0 and 999,999 that specifies the number of microseconds.

nanosecond      An integer between 0 and 999,999,999 that specifies the number of nanoseconds.

timezone        An expression that specifies the timezone.

epochSeconds    A numeric value representing the number of seconds from the UNIX epoch in the UTC timezone.

epochMillis     A numeric value representing the number of milliseconds from the UNIX epoch in the UTC timezone.

Considerations

If no parameters are provided, datetime() must be invoked (datetime({}) is invalid).

The month component will default to 1 if month is omitted.

The day of the month component will default to 1 if day is omitted.

The hour component will default to 0 if hour is omitted.

The minute component will default to 0 if minute is omitted.

The second component will default to 0 if second is omitted.

Any missing millisecond, microsecond or nanosecond values will default to 0.

The timezone component will default to the configured default timezone if timezone is omitted.

If millisecond, microsecond and nanosecond are given in combination (as part of the same set of parameters), the individual
values must be in the range 0 to 999.

The smallest components in the set year, month, day, hour, minute, and second may be omitted; i.e. it is possible to specify
only year, month and day, but specifying year, month, day and minute is not permitted.

One or more of millisecond, microsecond and nanosecond can only be specified as long as second is also specified.

String representations of temporal values must comply with the format defined for dates, times and time zones.

String representations of temporal values must denote a valid date; i.e. a temporal value denoting 30 February 2001 is
invalid.

If any of the optional parameters are provided, these will override the corresponding components of datetime, date and/or
time.

datetime(dd) may be written instead of datetime({datetime: dd}).

Selecting a ZONED TIME or ZONED DATETIME value as the time component also selects its timezone. If a LOCAL TIME or LOCAL
DATETIME is selected instead, the default timezone is used. In any case, the timezone can be overridden explicitly.

Selecting a ZONED DATETIME as the datetime component and overwriting the timezone will adjust the local time to keep the
same point in time.

Selecting a ZONED DATETIME or ZONED TIME as the time component and overwriting the timezone will adjust the local time to
keep the same point in time.

epochSeconds/epochMillis may be used in conjunction with nanosecond.

datetime(null) returns null.

Example 249. datetime() to get the current datetime (no parameters provided)

Query

RETURN datetime() AS currentDateTime

The current date and time using the local timezone is returned.

Result

currentDateTime

2022-06-14T10:02:28.192Z

Rows: 1

Example 250. datetime() with provided timezone

Query

RETURN datetime({timezone: 'America/Los Angeles'}) AS currentDateTimeInLA

The current date and time of day in California is returned.

Result

currentDateTimeInLA

2022-06-14T03:02:28.238-07:00[America/Los_Angeles]

Rows: 1

Creating ZONED DATETIME values

Example 251. datetime() - Creating a calendar (Year-Month-Day) ZONED DATETIME

Query

UNWIND [
datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14, millisecond: 123,
microsecond: 456, nanosecond: 789}),
datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14, millisecond: 645,
timezone: '+01:00'}),
datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14, nanosecond: 645876123,
timezone: 'Europe/Stockholm'}),
datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14, timezone: '+01:00'}),
datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14}),
datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, timezone: 'Europe/Stockholm'}),
datetime({year: 1984, month: 10, day: 11, hour: 12, timezone: '+01:00'}),
datetime({year: 1984, month: 10, day: 11, timezone: 'Europe/Stockholm'})
] AS theDate
RETURN theDate

Result

theDate

1984-10-11T12:31:14.123456789Z

1984-10-11T12:31:14.645+01:00

1984-10-11T12:31:14.645876123+01:00[Europe/Stockholm]

1984-10-11T12:31:14+01:00

1984-10-11T12:31:14Z

1984-10-11T12:31+01:00[Europe/Stockholm]

1984-10-11T12:00+01:00

1984-10-11T00:00+01:00[Europe/Stockholm]

Rows: 8

Example 252. datetime() - Creating a week (Year-Week-Day) ZONED DATETIME

Query

UNWIND [
datetime({year: 1984, week: 10, dayOfWeek: 3, hour: 12, minute: 31, second: 14, millisecond: 645}),
datetime({year: 1984, week: 10, dayOfWeek: 3, hour: 12, minute: 31, second: 14, microsecond: 645876,
timezone: '+01:00'}),
datetime({year: 1984, week: 10, dayOfWeek: 3, hour: 12, minute: 31, second: 14, nanosecond:
645876123, timezone: 'Europe/Stockholm'}),
datetime({year: 1984, week: 10, dayOfWeek: 3, hour: 12, minute: 31, second: 14, timezone:
'Europe/Stockholm'}),
datetime({year: 1984, week: 10, dayOfWeek: 3, hour: 12, minute: 31, second: 14}),
datetime({year: 1984, week: 10, dayOfWeek: 3, hour: 12, timezone: '+01:00'}),
datetime({year: 1984, week: 10, dayOfWeek: 3, timezone: 'Europe/Stockholm'})
] AS theDate
RETURN theDate

Result

theDate

1984-03-07T12:31:14.645Z

1984-03-07T12:31:14.645876+01:00

1984-03-07T12:31:14.645876123+01:00[Europe/Stockholm]

1984-03-07T12:31:14+01:00[Europe/Stockholm]

1984-03-07T12:31:14Z

1984-03-07T12:00+01:00

1984-03-07T00:00+01:00[Europe/Stockholm]

Rows: 7

Example 253. datetime() - Creating a quarter (Year-Quarter-Day) ZONED DATETIME

Query

UNWIND [
datetime({year: 1984, quarter: 3, dayOfQuarter: 45, hour: 12, minute: 31, second: 14, microsecond:
645876}),
datetime({year: 1984, quarter: 3, dayOfQuarter: 45, hour: 12, minute: 31, second: 14, timezone:
'+01:00'}),
datetime({year: 1984, quarter: 3, dayOfQuarter: 45, hour: 12, timezone: 'Europe/Stockholm'}),
datetime({year: 1984, quarter: 3, dayOfQuarter: 45})
] AS theDate
RETURN theDate

Result

theDate

1984-08-14T12:31:14.645876Z

1984-08-14T12:31:14+01:00

1984-08-14T12:00+02:00[Europe/Stockholm]

1984-08-14T00:00Z

Rows: 4

Example 254. datetime() - Creating an ordinal (Year-Day) ZONED DATETIME

Query

UNWIND [
datetime({year: 1984, ordinalDay: 202, hour: 12, minute: 31, second: 14, millisecond: 645}),
datetime({year: 1984, ordinalDay: 202, hour: 12, minute: 31, second: 14, timezone: '+01:00'}),
datetime({year: 1984, ordinalDay: 202, timezone: 'Europe/Stockholm'}),
datetime({year: 1984, ordinalDay: 202})
] AS theDate
RETURN theDate

Result

theDate

1984-07-20T12:31:14.645Z

1984-07-20T12:31:14+01:00

1984-07-20T00:00+02:00[Europe/Stockholm]

1984-07-20T00:00Z

Rows: 4

Example 255. datetime() - Creating a ZONED DATETIME from a STRING

Query

UNWIND [
datetime('2015-07-21T21:40:32.142+0100'),
datetime('2015-W30-2T214032.142Z'),
datetime('2015T214032-0100'),
datetime('20150721T21:40-01:30'),
datetime('2015-W30T2140-02'),
datetime('2015202T21+18:00'),
datetime('2015-07-21T21:40:32.142[Europe/London]'),
datetime('2015-07-21T21:40:32.142-04[America/New_York]')
] AS theDate
RETURN theDate

Result

theDate

2015-07-21T21:40:32.142+01:00

2015-07-21T21:40:32.142Z

2015-01-01T21:40:32-01:00

2015-07-21T21:40-01:30

2015-07-20T21:40-02:00

2015-07-21T21:00+18:00

2015-07-21T21:40:32.142+01:00[Europe/London]

2015-07-21T21:40:32.142-04:00[America/New_York]

Rows: 8

Example 256. datetime() - Creating a ZONED DATETIME using other temporal values as components

The following query shows the various usages of datetime({date [, year, ..., timezone]}).

Query

WITH date({year: 1984, month: 10, day: 11}) AS dd


RETURN
datetime({date: dd, hour: 10, minute: 10, second: 10}) AS dateHHMMSS,
datetime({date: dd, hour: 10, minute: 10, second: 10, timezone:'+05:00'}) AS dateHHMMSSTimezone,
datetime({date: dd, day: 28, hour: 10, minute: 10, second: 10}) AS dateDDHHMMSS,
datetime({date: dd, day: 28, hour: 10, minute: 10, second: 10, timezone:'Pacific/Honolulu'}) AS
dateDDHHMMSSTimezone

Result

dateHHMMSS            dateHHMMSSTimezone         dateDDHHMMSS          dateDDHHMMSSTimezone

1984-10-11T10:10:10Z  1984-10-11T10:10:10+05:00  1984-10-28T10:10:10Z  1984-10-28T10:10:10-10:00[Pacific/Honolulu]

Rows: 1

Example 257. datetime() - Creating a ZONED DATETIME using other temporal values as components

The following query shows the various usages of datetime({time [, year, …, timezone]}).

Query

WITH time({hour: 12, minute: 31, second: 14, microsecond: 645876, timezone: '+01:00'}) AS tt
RETURN
datetime({year: 1984, month: 10, day: 11, time: tt}) AS YYYYMMDDTime,
datetime({year: 1984, month: 10, day: 11, time: tt, timezone:'+05:00'}) AS YYYYMMDDTimeTimezone,
datetime({year: 1984, month: 10, day: 11, time: tt, second: 42}) AS YYYYMMDDTimeSS,
datetime({year: 1984, month: 10, day: 11, time: tt, second: 42, timezone: 'Pacific/Honolulu'}) AS
YYYYMMDDTimeSSTimezone

Result

YYYYMMDDTime                      YYYYMMDDTimeTimezone              YYYYMMDDTimeSS                    YYYYMMDDTimeSSTimezone

1984-10-11T12:31:14.645876+01:00  1984-10-11T16:31:14.645876+05:00  1984-10-11T12:31:42.645876+01:00  1984-10-11T01:31:42.645876-10:00[Pacific/Honolulu]

Rows: 1

Example 258. datetime() - Creating a ZONED DATETIME using other temporal values as components

The following query shows the various usages of datetime({date, time [, year, ...,
timezone]}); i.e. combining a DATE and a ZONED TIME value to create a single ZONED DATETIME value.

Query

WITH
date({year: 1984, month: 10, day: 11}) AS dd,
localtime({hour: 12, minute: 31, second: 14, millisecond: 645}) AS tt
RETURN
datetime({date: dd, time: tt}) AS dateTime,
datetime({date: dd, time: tt, timezone: '+05:00'}) AS dateTimeTimezone,
datetime({date: dd, time: tt, day: 28, second: 42}) AS dateTimeDDSS,
datetime({date: dd, time: tt, day: 28, second: 42, timezone: 'Pacific/Honolulu'}) AS
dateTimeDDSSTimezone

Result

dateTime                  dateTimeTimezone               dateTimeDDSS              dateTimeDDSSTimezone

1984-10-11T12:31:14.645Z  1984-10-11T12:31:14.645+05:00  1984-10-28T12:31:42.645Z  1984-10-28T12:31:42.645-10:00[Pacific/Honolulu]

Rows: 1

Example 259. datetime() - Creating a ZONED DATETIME using other temporal values as components

The following query shows the various usages of datetime({datetime [, year, ..., timezone]}).

Query

WITH
datetime({
year: 1984, month: 10, day: 11,
hour: 12,
timezone: 'Europe/Stockholm'
}) AS dd
RETURN
datetime({datetime: dd}) AS dateTime,
datetime({datetime: dd, timezone: '+05:00'}) AS dateTimeTimezone,
datetime({datetime: dd, day: 28, second: 42}) AS dateTimeDDSS,
datetime({datetime: dd, day: 28, second: 42, timezone: 'Pacific/Honolulu'}) AS dateTimeDDSSTimezone

Result

dateTime                                   dateTimeTimezone        dateTimeDDSS                                  dateTimeDDSSTimezone

1984-10-11T12:00+01:00[Europe/Stockholm]   1984-10-11T16:00+05:00  1984-10-28T12:00:42+01:00[Europe/Stockholm]   1984-10-28T01:00:42-10:00[Pacific/Honolulu]

Rows: 1

Example 260. datetime() - Creating a ZONED DATETIME from UNIX epoch (epochSeconds)

datetime() returns the ZONED DATETIME value at the specified number of seconds or milliseconds from
the UNIX epoch in the UTC timezone.

Conversions to other temporal instant types from UNIX epoch representations can be achieved by
transforming a ZONED DATETIME value to one of these types.

Query

RETURN datetime({epochSeconds: timestamp() / 1000, nanosecond: 23}) AS theDate

Result

theDate

2022-06-14T10:02:30.000000023Z

Rows: 1
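
A minimal sketch of the conversion described above, turning an epoch-based ZONED DATETIME into a DATE by
selecting from it (the epoch value is an assumption for illustration); it should return 1983-06-18, the
calendar date of that instant in UTC:

RETURN date(datetime({epochMillis: 424797300000})) AS dateFromEpoch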

Example 261. datetime() - Creating a ZONED DATETIME from UNIX epoch (epochMillis)

Query

RETURN datetime({epochMillis: 424797300000}) AS theDate

Result

theDate

1983-06-18T15:15Z

Rows: 1

datetime.fromEpoch()
Details

Syntax datetime.fromepoch(seconds, nanoseconds)

Description Creates a ZONED DATETIME given the seconds and nanoseconds since the start of the epoch.

Arguments Name Type Description

seconds INTEGER | FLOAT The number of seconds from the UNIX epoch in the UTC timezone.

nanoseconds INTEGER | FLOAT The number of nanoseconds from the UNIX epoch in the UTC timezone. This can be
added to seconds.

Returns ZONED DATETIME

Example 262. datetime.fromEpoch()

Query

WITH datetime.fromepoch(1683000000, 123456789) AS dateTimeFromEpoch


RETURN dateTimeFromEpoch

Result

dateTimeFromEpoch

2023-05-02T04:00:00.123456789Z

Rows: 1

datetime.fromEpochMillis()
Details

Syntax datetime.fromepochmillis(milliseconds)

Description Creates a ZONED DATETIME given the milliseconds since the start of the epoch.

Arguments Name Type Description

milliseconds INTEGER | FLOAT The number of milliseconds from the UNIX epoch in the UTC timezone.

Returns ZONED DATETIME

Example 263. datetime.fromEpochMillis()

Query

WITH datetime.fromepochmillis(1724198400000) AS dateTimeFromMillis


RETURN dateTimeFromMillis

Result

dateTimeFromMillis

2024-08-21T00:00Z

Rows: 1

datetime.realtime()
Details

Syntax datetime.realtime([ timezone ])

Description Returns the current ZONED DATETIME instant using the realtime clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns ZONED DATETIME

The returned ZONED DATETIME will be the live clock of the system.

Example 264. datetime.realtime()

Query

RETURN datetime.realtime() AS currentDateTime

Result

currentDateTime

2022-06-14T10:02:28.494444Z

Rows: 1

datetime.statement()
Details

Syntax datetime.statement([ timezone ])

Description Returns the current ZONED DATETIME instant using the statement clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns ZONED DATETIME

This returned ZONED DATETIME will be the same for each invocation within the same statement. However, a
different value may be produced for different statements within the same transaction.

Example 265. datetime.statement()

Query

RETURN datetime.statement() AS currentDateTime

Result

currentDateTime

2022-06-14T10:02:28.395Z

Rows: 1

datetime.transaction()
Details

Syntax datetime.transaction([ timezone ])

Description Returns the current ZONED DATETIME instant using the transaction clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns ZONED DATETIME

The returned ZONED DATETIME value will be the same for each invocation within the same transaction.
However, a different value may be produced for different transactions.

Example 266. datetime.transaction()

Query

RETURN datetime.transaction() AS currentDateTime

Result

currentDateTime

2022-06-14T10:02:28.290Z

Rows: 1

Example 267. datetime.transaction()

Query

RETURN datetime.transaction('America/Los Angeles') AS currentDateTimeInLA

Result

currentDateTimeInLA

2022-06-14T03:02:28.338-07:00[America/Los_Angeles]

Rows: 1

datetime.truncate()
Details

Syntax datetime.truncate(unit [, input, fields])

Description Truncates the given temporal value to a ZONED DATETIME instant using the specified unit.

Arguments Name Type Description

unit STRING A string representing one of the following: 'microsecond', 'millisecond', 'second', 'minute',
'hour', 'day', 'week', 'month', 'weekYear', 'quarter', 'year', 'decade', 'century', 'millennium'.

input ANY The date to be truncated using either ZONED DATETIME, LOCAL DATETIME, or DATE.

fields MAP A list of time components smaller than those specified in unit to preserve during truncation.

Returns ZONED DATETIME

datetime.truncate() returns the ZONED DATETIME value obtained by truncating a specified temporal instant
value at the nearest preceding point in time at the specified component boundary (which is denoted by the
truncation unit passed as a parameter to the function). In other words, the ZONED DATETIME returned will
have all components that are smaller than the specified truncation unit set to their default values.

It is possible to supplement the truncated value by providing a map containing components which are
smaller than the truncation unit. This will have the effect of overriding the default values which would
otherwise have been set for these smaller components. For example, day — with some value x — may be
provided when the truncation unit STRING is 'year' in order to ensure the returned value has the day set to
x instead of the default day (which is 1).

Considerations

input cannot be a DATE value if unit is one of: 'hour', 'minute', 'second', 'millisecond', 'microsecond'.

The timezone of input may be overridden; for example, datetime.truncate('minute', input, {timezone: '+0200'}).

If input is one of ZONED TIME, ZONED DATETIME — a value with a timezone — and the timezone is overridden, no time
conversion occurs.

If input is one of LOCAL DATETIME, DATE — a value without a timezone — and the timezone is not overridden, the configured
default timezone will be used.

Any component that is provided in fields must be smaller than unit; i.e. if unit is 'day', fields cannot contain information
pertaining to a month.

Any component that is not contained in fields and which is smaller than unit will be set to its minimal value.

If fields is not provided, all components of the returned value which are smaller than unit will be set to their default values.

If input is not provided, it will be set to the current date, time and timezone, i.e. datetime.truncate(unit) is equivalent to
datetime.truncate(unit, datetime()).

Example 268. datetime.truncate()

Query

WITH
datetime({
year:2017, month:11, day:11,
hour:12, minute:31, second:14, nanosecond: 645876123,
timezone: '+03:00'
}) AS d
RETURN
datetime.truncate('millennium', d, {timezone: 'Europe/Stockholm'}) AS truncMillenium,
datetime.truncate('year', d, {day: 5}) AS truncYear,
datetime.truncate('month', d) AS truncMonth,
datetime.truncate('day', d, {millisecond: 2}) AS truncDay,
datetime.truncate('hour', d) AS truncHour,
datetime.truncate('second', d) AS truncSecond

Result

truncMillenium                            truncYear               truncMonth              truncDay                       truncHour               truncSecond

2000-01-01T00:00+01:00[Europe/Stockholm]  2017-01-05T00:00+03:00  2017-11-01T00:00+03:00  2017-11-11T00:00:00.002+03:00  2017-11-11T12:00+03:00  2017-11-11T12:31:14+03:00

Rows: 1

localdatetime()
Details

Syntax localdatetime([ input ])

Description Creates a LOCAL DATETIME instant.

Arguments Name Type Description

input ANY Either a string representation of a temporal value, a map containing the single key 'timezone',
or a map containing temporal values ('year', 'month', 'day', 'hour', 'minute', 'second', 'millisecond',
'microsecond', 'nanosecond') as components.

Returns LOCAL DATETIME

Temporal components

Name            Description

A single map consisting of the following:

year            An expression consisting of at least four digits that specifies the year.

month           An integer between 1 and 12 that specifies the month.

day             An integer between 1 and 31 that specifies the day of the month.

hour            An integer between 0 and 23 that specifies the hour of the day.

minute          An integer between 0 and 59 that specifies the number of minutes.

second          An integer between 0 and 59 that specifies the number of seconds.

millisecond     An integer between 0 and 999 that specifies the number of milliseconds.

microsecond     An integer between 0 and 999,999 that specifies the number of microseconds.

nanosecond      An integer between 0 and 999,999,999 that specifies the number of nanoseconds.

Considerations

If no parameters are provided, localdatetime() must be invoked (localdatetime({}) is invalid).

The month component will default to 1 if month is omitted.

The day of the month component will default to 1 if day is omitted.

The hour component will default to 0 if hour is omitted.

The minute component will default to 0 if minute is omitted.

The second component will default to 0 if second is omitted.

Any missing millisecond, microsecond or nanosecond values will default to 0.

If millisecond, microsecond and nanosecond are given in combination (as part of the same set of parameters), the individual
values must be in the range 0 to 999.

The smallest components in the set year, month, day, hour, minute, and second may be omitted; i.e. it is possible to specify
only year, month and day, but specifying year, month, day and minute is not permitted.

One or more of millisecond, microsecond and nanosecond can only be specified as long as second is also specified.

String representations of temporal values must comply with the format defined for dates and times.

String representations of temporal values must denote a valid date; i.e. a temporal value denoting 30 February 2001 is
invalid.

localdatetime(null) returns null.

If any of the optional parameters are provided, these will override the corresponding components of datetime, date and/or
time.

localdatetime(dd) may be written instead of localdatetime({datetime: dd}).

Example 269. localdatetime() - to get current local date and time (no parameters)

Query

RETURN localdatetime() AS now

The current local date and time (i.e. in the local timezone) is returned.

Result

now

2022-06-14T10:02:30.447

Rows: 1

Example 270. localdatetime() with timezone

Query

RETURN localdatetime({timezone: 'America/Los Angeles'}) AS now

The current local date and time in California is returned.

Result

now

2022-06-14T03:02:30.482

Rows: 1

Creating LOCAL DATETIME values


Example 271. localdatetime() - Creating a calendar (Year-Month-Day) LOCAL DATETIME

Query

RETURN
localdatetime({
year: 1984, month: 10, day: 11,
hour: 12, minute: 31, second: 14, millisecond: 123, microsecond: 456, nanosecond: 789
}) AS theDate

Result

theDate

1984-10-11T12:31:14.123456789

Rows: 1

Example 272. localdatetime() - Creating a week (Year-Week-Day) LOCAL DATETIME

Query

RETURN
localdatetime({
year: 1984, week: 10, dayOfWeek: 3,
hour: 12, minute: 31, second: 14, millisecond: 645
}) AS theDate

Result

theDate

1984-03-07T12:31:14.645

Rows: 1

Example 273. localdatetime() - Creating a quarter (Year-Quarter-Day) LOCAL DATETIME

Query

RETURN
localdatetime({
year: 1984, quarter: 3, dayOfQuarter: 45,
hour: 12, minute: 31, second: 14, nanosecond: 645876123
}) AS theDate

Result

theDate

1984-08-14T12:31:14.645876123

Rows: 1

Example 274. localdatetime() - Creating an ordinal (Year-Day) LOCAL DATETIME

Query

RETURN
localdatetime({
year: 1984, ordinalDay: 202,
hour: 12, minute: 31, second: 14, microsecond: 645876
}) AS theDate

Result

theDate

1984-07-20T12:31:14.645876

Rows: 1

Example 275. localdatetime() - Creating a LOCAL DATETIME from a STRING

Query

UNWIND [
localdatetime('2015-07-21T21:40:32.142'),
localdatetime('2015-W30-2T214032.142'),
localdatetime('2015-202T21:40:32'),
localdatetime('2015202T21')
] AS theDate
RETURN theDate

Result

theDate

2015-07-21T21:40:32.142

2015-07-21T21:40:32.142

2015-07-21T21:40:32

2015-07-21T21:00

Rows: 4

Example 276. localdatetime() - Creating a LOCAL DATETIME using other temporal values as components

The following query shows the various usages of localdatetime({date [, year, ...,
nanosecond]}).

Query

WITH date({year: 1984, month: 10, day: 11}) AS dd


RETURN
localdatetime({date: dd, hour: 10, minute: 10, second: 10}) AS dateHHMMSS,
localdatetime({date: dd, day: 28, hour: 10, minute: 10, second: 10}) AS dateDDHHMMSS

Result

dateHHMMSS dateDDHHMMSS

1984-10-11T10:10:10 1984-10-28T10:10:10

Rows: 1

Example 277. localdatetime() - Creating a LOCAL DATETIME using other temporal values as components

The following query shows the various usages of localdatetime({time [, year, ...,
nanosecond]}).

Query

WITH time({hour: 12, minute: 31, second: 14, microsecond: 645876, timezone: '+01:00'}) AS tt
RETURN
localdatetime({year: 1984, month: 10, day: 11, time: tt}) AS YYYYMMDDTime,
localdatetime({year: 1984, month: 10, day: 11, time: tt, second: 42}) AS YYYYMMDDTimeSS

Result

YYYYMMDDTime YYYYMMDDTimeSS

1984-10-11T12:31:14.645876 1984-10-11T12:31:42.645876

Rows: 1

Example 278. localdatetime() - Creating a LOCAL DATETIME using other temporal values as components

The following query shows the various usages of localdatetime({date, time [, year, ...,
nanosecond]}); i.e. combining a DATE and a ZONED TIME value to create a single LOCAL DATETIME value.

Query

WITH
date({year: 1984, month: 10, day: 11}) AS dd,
time({hour: 12, minute: 31, second: 14, microsecond: 645876, timezone: '+01:00'}) AS tt
RETURN
localdatetime({date: dd, time: tt}) AS dateTime,
localdatetime({date: dd, time: tt, day: 28, second: 42}) AS dateTimeDDSS

Result

dateTime dateTimeDDSS

1984-10-11T12:31:14.645876 1984-10-28T12:31:42.645876

Rows: 1

Example 279. localdatetime() - Creating a LOCAL DATETIME using other temporal values as components

The following query shows the various usages of localdatetime({datetime [, year, ...,
nanosecond]}).

Query

WITH
datetime({
year: 1984, month: 10, day: 11,
hour: 12,
timezone: '+01:00'
}) AS dd
RETURN
localdatetime({datetime: dd}) AS dateTime,
localdatetime({datetime: dd, day: 28, second: 42}) AS dateTimeDDSS

Result

dateTime dateTimeDDSS

1984-10-11T12:00 1984-10-28T12:00:42

Rows: 1

localdatetime.realtime()
Details

Syntax localdatetime.realtime([ timezone ])

Description Returns the current LOCAL DATETIME instant using the realtime clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns LOCAL DATETIME

The returned LOCAL DATETIME will be the live clock of the system.

Example 280. localdatetime.realtime()

Query

RETURN localdatetime.realtime() AS now

Result

now

2022-06-14T10:02:30.647817

Rows: 1

Example 281. localdatetime.realtime()

Query

RETURN localdatetime.realtime('America/Los Angeles') AS nowInLA

Result

nowInLA

2022-06-14T03:02:30.691099

Rows: 1

localdatetime.statement()
Details

Syntax localdatetime.statement([ timezone ])

Description Returns the current LOCAL DATETIME instant using the statement clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns LOCAL DATETIME

The returned LOCAL DATETIME will be the same for each invocation within the same statement. However, a
different value may be produced for different statements within the same transaction.
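
The difference between the statement clock and the realtime clock can be seen by invoking each twice in the same statement. This is a minimal sketch; the realtime comparison is not guaranteed to differ, as both reads may fall within the clock's resolution:

RETURN
  localdatetime.statement() = localdatetime.statement() AS sameStatementClock,
  localdatetime.realtime() = localdatetime.realtime() AS sameRealtimeClock

sameStatementClock is always true, because the statement clock is fixed for the duration of the statement, while sameRealtimeClock is typically false.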

Example 282. localdatetime.statement()

Query

RETURN localdatetime.statement() AS now

Result

now

2022-06-14T10:02:30.570

Rows: 1

localdatetime.transaction()
Details

Syntax localdatetime.transaction([ timezone ])

Description Returns the current LOCAL DATETIME instant using the transaction clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns LOCAL DATETIME

The returned LOCAL DATETIME will be the same for each invocation within the same transaction. However, a
different value may be produced for different transactions.

Example 283. localdatetime.transaction()

Query

RETURN localdatetime.transaction() AS now

Result

now

2022-06-14T10:02:30.532

Rows: 1

localdatetime.truncate()
Details

Syntax localdatetime.truncate(unit [, input, fields])

Description Truncates the given temporal value to a LOCAL DATETIME instant using the specified unit.

Arguments Name Type Description

unit STRING A string representing one of the following: 'microsecond', 'millisecond', 'second', 'minute', 'hour', 'day', 'week', 'month', 'weekYear', 'quarter', 'year', 'decade', 'century', 'millennium'.

input ANY The date to be truncated using either ZONED DATETIME, LOCAL DATETIME, or DATE.

fields MAP A list of time components smaller than those specified in unit to preserve during truncation.

Returns LOCAL DATETIME

localdatetime.truncate() returns the LOCAL DATETIME value obtained by truncating a specified temporal
instant value at the nearest preceding point in time at the specified component boundary (which is
denoted by the truncation unit passed as a parameter to the function). In other words, the LOCAL DATETIME
returned will have all components that are smaller than the specified truncation unit set to their default
values.

It is possible to supplement the truncated value by providing a map containing components which are
smaller than the truncation unit. This will have the effect of overriding the default values which would
otherwise have been set for these smaller components. For example, day — with some value x — may be
provided when the truncation unit STRING is 'year' in order to ensure the returned value has the day set to
x instead of the default day (which is 1).

Considerations

input cannot be a DATE value if unit is one of: 'hour', 'minute', 'second', 'millisecond', 'microsecond'.

Any component that is provided in fields must be smaller than unit; i.e. if unit is 'day', fields cannot contain
information pertaining to a month.

Any component that is not contained in fields and which is smaller than unit will be set to its minimal value.

If fields is not provided, all components of the returned value which are smaller than unit will be set to their default values.

If input is not provided, it will be set to the current date and time, i.e. localdatetime.truncate(unit) is equivalent to
localdatetime.truncate(unit, localdatetime()).

Example 284. localdatetime.truncate()

Query

WITH
localdatetime({
year: 2017, month: 11, day: 11,
hour: 12, minute: 31, second: 14, nanosecond: 645876123
}) AS d
RETURN
localdatetime.truncate('millennium', d) AS truncMillenium,
localdatetime.truncate('year', d, {day: 2}) AS truncYear,
localdatetime.truncate('month', d) AS truncMonth,
localdatetime.truncate('day', d) AS truncDay,
localdatetime.truncate('hour', d, {nanosecond: 2}) AS truncHour,
localdatetime.truncate('second', d) AS truncSecond

Result

truncMillenium    2000-01-01T00:00
truncYear         2017-01-02T00:00
truncMonth        2017-11-01T00:00
truncDay          2017-11-11T00:00
truncHour         2017-11-11T12:00:00.000000002
truncSecond       2017-11-11T12:31:14

Rows: 1

localtime()
Details

Syntax localtime([ input ])

Description Creates a LOCAL TIME instant.

Arguments Name Type Description

input ANY Either a string representation of a temporal value, a map containing the single key 'timezone', or a map containing temporal values ('hour', 'minute', 'second', 'millisecond', 'microsecond', 'nanosecond') as components.

Returns LOCAL TIME

Temporal components

Name Description

hour An integer between 0 and 23 that specifies the hour of the day.

minute An integer between 0 and 59 that specifies the number of minutes.

second An integer between 0 and 59 that specifies the number of seconds.

millisecond An integer between 0 and 999 that specifies the number of milliseconds.

microsecond An integer between 0 and 999,999 that specifies the number of microseconds.

nanosecond An integer between 0 and 999,999,999 that specifies the number of nanoseconds.

Considerations

If no parameters are provided, localtime() must be invoked (localtime({}) is invalid).

The hour component will default to 0 if hour is omitted.

The minute component will default to 0 if minute is omitted.

The second component will default to 0 if second is omitted.

Any missing millisecond, microsecond or nanosecond values will default to 0.

If millisecond, microsecond and nanosecond are given in combination (as part of the same set of parameters), the individual
values must be in the range 0 to 999.

The smallest components in the set hour, minute, and second may be omitted; i.e. it is possible to specify only hour and
minute, but specifying hour and second is not permitted.

One or more of millisecond, microsecond and nanosecond can only be specified as long as second is also specified.

String representations of temporal values must comply with the format defined for times.

String representations of temporal values must denote a valid time; i.e. a temporal value denoting 13:46:64 is invalid.

localtime(null) returns null.

If any of the optional parameters are provided, these will override the corresponding components of time.

localtime(tt) may be written instead of localtime({time: tt}).

Example 285. localtime() to get the current time (no parameters)

Query

RETURN localtime() AS now

The current local time (i.e. in the local timezone) is returned.

Result

now

10:02:31.596

Rows: 1

Example 286. localtime() with timezone

Query

RETURN localtime({timezone: 'America/Los Angeles'}) AS nowInLA

The current local time in California is returned.

Result

nowInLA

03:02:31.629

Rows: 1

Creating LOCAL TIME values

Example 287. localtime()

Query

UNWIND [
localtime({hour: 12, minute: 31, second: 14, nanosecond: 789, millisecond: 123, microsecond: 456}),
localtime({hour: 12, minute: 31, second: 14}),
localtime({hour: 12})
] AS theTime
RETURN theTime

Result

theTime

12:31:14.123456789

12:31:14

12:00

Rows: 3

Example 288. localtime() - Creating a LOCAL TIME from a STRING

Query

UNWIND [
localtime('21:40:32.142'),
localtime('214032.142'),
localtime('21:40'),
localtime('21')
] AS theTime
RETURN theTime

Result

theTime

21:40:32.142

21:40:32.142

21:40

21:00

Rows: 4

Example 289. localtime() - Creating a LOCAL TIME using other temporal values as components

Query

WITH time({hour: 12, minute: 31, second: 14, microsecond: 645876, timezone: '+01:00'}) AS tt
RETURN
localtime({time: tt}) AS timeOnly,
localtime({time: tt, second: 42}) AS timeSS

Result

timeOnly timeSS

12:31:14.645876 12:31:42.645876

Rows: 1

localtime.realtime()
Details

Syntax localtime.realtime([ timezone ])

Description Returns the current LOCAL TIME instant using the realtime clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns LOCAL TIME

The returned LOCAL TIME will be the live clock of the system.

Example 290. localtime.realtime()

Query

RETURN localtime.realtime() AS now

Result

now

10:02:31.806895

Rows: 1

localtime.statement()
Details

Syntax localtime.statement([ timezone ])

Description Returns the current LOCAL TIME instant using the statement clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns LOCAL TIME

The returned LOCAL TIME will be the same for each invocation within the same statement. However, a
different value may be produced for different statements within the same transaction.

Example 291. localtime.statement()

Query

RETURN localtime.statement() AS now

Result

now

10:02:31.697

Rows: 1

Example 292. localtime.statement()

Query

RETURN localtime.statement('America/Los Angeles') AS nowInLA

Result

nowInLA

03:02:31.737

Rows: 1

localtime.transaction()
Details

Syntax localtime.transaction([ timezone ])

Description Returns the current LOCAL TIME instant using the transaction clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns LOCAL TIME

The returned LOCAL TIME will be the same for each invocation within the same transaction. However, a
different value may be produced for different transactions.

Example 293. localtime.transaction()

Query

RETURN localtime.transaction() AS now

Result

now

10:02:31.662

Rows: 1

localtime.truncate()
Details

Syntax localtime.truncate(unit [, input, fields])

Description Truncates the given temporal value to a LOCAL TIME instant using the specified unit.

Arguments Name Type Description

unit STRING A string representing one of the following: 'microsecond', 'millisecond', 'second', 'minute', 'hour', 'day'.

input ANY The value to be truncated using either ZONED DATETIME, LOCAL DATETIME, ZONED TIME, or LOCAL TIME.

fields MAP A list of time components smaller than those specified in unit to preserve during truncation.

Returns LOCAL TIME

localtime.truncate() returns the LOCAL TIME value obtained by truncating a specified temporal instant
value at the nearest preceding point in time at the specified component boundary (which is denoted by the
truncation unit passed as a parameter to the function). In other words, the LOCAL TIME returned will have
all components that are smaller than the specified truncation unit set to their default values.

It is possible to supplement the truncated value by providing a map containing components which are
smaller than the truncation unit. This will have the effect of overriding the default values which would
otherwise have been set for these smaller components. For example, minute — with some value x — may
be provided when the truncation unit string is 'hour' in order to ensure the returned value has the minute
set to x instead of the default minute (which is 0).

Considerations

Truncating time to day — i.e. unit is 'day' — is supported, and yields midnight at the start of the day
(00:00), regardless of the value of input.

Any component that is provided in fields must be smaller than unit; i.e. if unit is 'second', fields cannot contain
information pertaining to a minute.

Any component that is not contained in fields and which is smaller than unit will be set to its minimal value.

If fields is not provided, all components of the returned value which are smaller than unit will be set to their default values.

If input is not provided, it will be set to the current time, i.e. localtime.truncate(unit) is equivalent to
localtime.truncate(unit, localtime()).

Example 294. localtime.truncate()

Query

WITH time({hour: 12, minute: 31, second: 14, nanosecond: 645876123, timezone: '-01:00'}) AS t
RETURN
localtime.truncate('day', t) AS truncDay,
localtime.truncate('hour', t) AS truncHour,
localtime.truncate('minute', t, {millisecond: 2}) AS truncMinute,
localtime.truncate('second', t) AS truncSecond,
localtime.truncate('millisecond', t) AS truncMillisecond,
localtime.truncate('microsecond', t) AS truncMicrosecond

Result

truncDay            00:00:00
truncHour           12:00:00
truncMinute         12:31:00.002000000
truncSecond         12:31:14
truncMillisecond    12:31:14.645000000
truncMicrosecond    12:31:14.645876000

Rows: 1

time()
Details

Syntax time([ input ])

Description Creates a ZONED TIME instant.

Arguments Name Type Description

input ANY Either a string representation of a temporal value, a map containing the single key 'timezone', or a map containing temporal values ('hour', 'minute', 'second', 'millisecond', 'microsecond', 'nanosecond', 'timezone') as components.

Returns ZONED TIME

Temporal components

Name Description

hour An integer between 0 and 23 that specifies the hour of the day.

minute An integer between 0 and 59 that specifies the number of minutes.

second An integer between 0 and 59 that specifies the number of seconds.

millisecond An integer between 0 and 999 that specifies the number of milliseconds.

microsecond An integer between 0 and 999,999 that specifies the number of microseconds.

nanosecond An integer between 0 and 999,999,999 that specifies the number of nanoseconds.

timezone An expression that specifies the timezone.

Considerations

If no parameters are provided, time() must be invoked (time({}) is invalid).

The hour component will default to 0 if hour is omitted.

The minute component will default to 0 if minute is omitted.

The second component will default to 0 if second is omitted.

Any missing millisecond, microsecond or nanosecond values will default to 0.

The timezone component will default to the configured default timezone if timezone is omitted.

If millisecond, microsecond and nanosecond are given in combination (as part of the same set of parameters), the individual
values must be in the range 0 to 999.

The smallest components in the set hour, minute, and second may be omitted; i.e. it is possible to specify only hour and
minute, but specifying hour and second is not permitted.

One or more of millisecond, microsecond and nanosecond can only be specified as long as second is also specified.

String representations of temporal values must comply with the format defined for times and time zones.

String representations of temporal values must denote a valid time; i.e. a temporal value denoting 15:67 is invalid.

time(null) returns null.

If any of the optional parameters are provided, these will override the corresponding components of time.

time(tt) may be written instead of time({time: tt}).

Selecting a ZONED TIME or ZONED DATETIME value as the time component also selects its timezone. If a LOCAL TIME or LOCAL
DATETIME is selected instead, the default timezone is used. In any case, the timezone can be overridden explicitly.

Selecting a ZONED DATETIME or ZONED TIME as the time component and overwriting the timezone will adjust the local time to
keep the same point in time.
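
A minimal sketch illustrating the last two considerations:

WITH
  localtime({hour: 12, minute: 31, second: 14}) AS localTT,
  time({hour: 12, minute: 31, second: 14, timezone: '+01:00'}) AS zonedTT
RETURN
  time({time: localTT, timezone: '+05:00'}) AS fromLocal,
  time({time: zonedTT, timezone: '+05:00'}) AS fromZoned

fromLocal keeps the wall-clock time and attaches the new offset (12:31:14+05:00), whereas fromZoned adjusts the local time so that the same point in time is kept (16:31:14+05:00).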

Example 295. time() to get the current time (no parameters)

Query

RETURN time() AS currentTime

The current time of day using the local timezone is returned.

Result

currentTime

10:02:32.192Z

Rows: 1

Example 296. time() with timezone

Query

RETURN time({timezone: 'America/Los Angeles'}) AS currentTimeInLA

The current time of day in California is returned.

Result

currentTimeInLA

03:02:32.233-07:00

Rows: 1

Creating ZONED TIME values

Example 297. time()

Query

UNWIND [
time({hour: 12, minute: 31, second: 14, millisecond: 123, microsecond: 456, nanosecond: 789}),
time({hour: 12, minute: 31, second: 14, nanosecond: 645876123}),
time({hour: 12, minute: 31, second: 14, microsecond: 645876, timezone: '+01:00'}),
time({hour: 12, minute: 31, timezone: '+01:00'}),
time({hour: 12, timezone: '+01:00'})
] AS theTime
RETURN theTime

Result

theTime

12:31:14.123456789Z

12:31:14.645876123Z

12:31:14.645876000+01:00

12:31:00+01:00

12:00:00+01:00

Rows: 5

Example 298. time() - Creating a ZONED TIME from a STRING

Query

UNWIND [
time('21:40:32.142+0100'),
time('214032.142Z'),
time('21:40:32+01:00'),
time('214032-0100'),
time('21:40-01:30'),
time('2140-00:00'),
time('2140-02'),
time('22+18:00')
] AS theTime
RETURN theTime

Result

theTime

21:40:32.142000000+01:00

21:40:32.142000000Z

21:40:32+01:00

21:40:32-01:00

21:40:00-01:30

21:40:00Z

21:40:00-02:00

22:00:00+18:00

Rows: 8

Example 299. time() - Creating a ZONED TIME using other temporal values as components

Query

WITH localtime({hour: 12, minute: 31, second: 14, microsecond: 645876}) AS tt


RETURN
time({time: tt}) AS timeOnly,
time({time: tt, timezone: '+05:00'}) AS timeTimezone,
time({time: tt, second: 42}) AS timeSS,
time({time: tt, second: 42, timezone: '+05:00'}) AS timeSSTimezone

Result

timeOnly timeTimezone timeSS timeSSTimezone

12:31:14.645876Z 12:31:14.645876+05:00 12:31:42.645876Z 12:31:42.645876+05:00

Rows: 1

time.realtime()
Details

Syntax time.realtime([ timezone ])

Description Returns the current ZONED TIME instant using the realtime clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns ZONED TIME

The returned ZONED TIME will be the live clock of the system.

Example 300. time.realtime()

Query

RETURN time.realtime() AS currentTime

Result

currentTime

10:02:32.436948Z

Rows: 1

time.statement()
Details

Syntax time.statement([ timezone ])

Description Returns the current ZONED TIME instant using the statement clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns ZONED TIME

The returned ZONED TIME will be the same for each invocation within the same statement. However, a
different value may be produced for different statements within the same transaction.

Example 301. time.statement()

Query

RETURN time.statement() AS currentTime

Result

currentTime

10:02:32.317Z

Rows: 1

Example 302. time.statement()

Query

RETURN time.statement('America/Los Angeles') AS currentTimeInLA

Result

currentTimeInLA

03:02:32.351-07:00

Rows: 1

time.transaction()
Details

Syntax time.transaction([ timezone ])

Description Returns the current ZONED TIME instant using the transaction clock.

Arguments Name Type Description

timezone ANY A string value representing a timezone.

Returns ZONED TIME

The returned ZONED TIME will be the same for each invocation within the same transaction. However, a
different value may be produced for different transactions.

Example 303. time.transaction()

Query

RETURN time.transaction() AS currentTime

Result

currentTime

10:02:32.276Z

Rows: 1

time.truncate()
Details

Syntax time.truncate(unit [, input, fields])

Description Truncates the given temporal value to a ZONED TIME instant using the specified unit.

Arguments Name Type Description

unit STRING A string representing one of the following: 'microsecond', 'millisecond', 'second', 'minute', 'hour', 'day'.

input ANY The date to be truncated using either ZONED DATETIME, LOCAL DATETIME, ZONED TIME, or LOCAL TIME.

fields MAP A list of time components smaller than those specified in unit to preserve during truncation.

Returns ZONED TIME

time.truncate() returns the ZONED TIME value obtained by truncating a specified temporal instant value at
the nearest preceding point in time at the specified component boundary (which is denoted by the
truncation unit passed as a parameter to the function). In other words, the ZONED TIME returned will have
all components that are smaller than the specified truncation unit set to their default values.

It is possible to supplement the truncated value by providing a map containing components which are
smaller than the truncation unit. This will have the effect of overriding the default values which would
otherwise have been set for these smaller components. For example, minute — with some value x — may
be provided when the truncation unit STRING is 'hour' in order to ensure the returned value has the minute
set to x instead of the default minute (which is 0).

Considerations

Truncating time to day — i.e. unit is 'day' — is supported, and yields midnight at the start of the day (00:00), regardless of
the value of input. However, the timezone of input is retained.

The timezone of input may be overridden; for example, time.truncate('minute', input, {timezone: '+0200'}).

If input is one of ZONED TIME, ZONED DATETIME — a value with a timezone — and the timezone is overridden, no time
conversion occurs.

If input is one of LOCAL TIME, LOCAL DATETIME, DATE — a value without a timezone — and the timezone is not overridden, the
configured default timezone will be used.

Any component that is provided in fields must be smaller than unit; i.e. if unit is 'second', fields cannot contain
information pertaining to a minute.

Any component that is not contained in fields and which is smaller than unit will be set to its minimal value.

If fields is not provided, all components of the returned value which are smaller than unit will be set to their default values.

If input is not provided, it will be set to the current time and timezone, i.e. time.truncate(unit) is equivalent to
time.truncate(unit, time()).

Example 304. time.truncate()

Query

WITH time({hour: 12, minute: 31, second: 14, nanosecond: 645876123, timezone: '-01:00'}) AS t
RETURN
time.truncate('day', t) AS truncDay,
time.truncate('hour', t) AS truncHour,
time.truncate('minute', t) AS truncMinute,
time.truncate('second', t) AS truncSecond,
time.truncate('millisecond', t, {nanosecond: 2}) AS truncMillisecond,
time.truncate('microsecond', t) AS truncMicrosecond

Result

truncDay            00:00:00-01:00
truncHour           12:00:00-01:00
truncMinute         12:31:00-01:00
truncSecond         12:31:14-01:00
truncMillisecond    12:31:14.645000002-01:00
truncMicrosecond    12:31:14.645876000-01:00

Rows: 1

User-defined functions
User-defined functions are written in Java, deployed into the database and are called in the same way as
any other Cypher function.

There are two main types of functions that can be developed and used:

Scalar: For each row the function takes parameters and returns a result. Usage: Using UDF. Developing: Extending Neo4j (UDF).

Aggregating: Consumes many rows and produces an aggregated result. Usage: Using aggregating UDF. Developing: Extending Neo4j (Aggregating UDF).

User-defined scalar functions


For each incoming row the function takes parameters and returns a single result.

For developing and deploying user-defined functions in Neo4j, see Extending Neo4j → User-defined
functions.

Example 305. Call a user-defined function

This example shows how you invoke a user-defined function called join from Cypher.

This calls the user-defined function org.neo4j.function.example.join().

Query

MATCH (n:Member)
RETURN org.neo4j.function.example.join(collect(n.name)) AS members

Result

members

"John,Paul,George,Ringo"

Rows: 1

User-defined aggregation functions


Aggregating functions consume many rows and produce a single aggregated result.

Example 306. Call a user-defined aggregation function

This example shows how you invoke a user-defined aggregation function called longestString from
Cypher.

This calls the user-defined function org.neo4j.function.example.longestString().

Query

MATCH (n:Member)
RETURN org.neo4j.function.example.longestString(n.name) AS member

Result

member

"George"

Rows: 1

Vector functions
Vector functions allow you to compute the similarity scores of vector pairs. These vector similarity
functions are identical to those used by Neo4j’s vector search indexes.

vector.similarity.cosine()
Details

Syntax vector.similarity.cosine(a, b)

Description Returns a FLOAT representing the similarity between the argument vectors based on their
cosine.

Arguments Name Type Description

a LIST<INTEGER | FLOAT> A list representing the first vector.

b LIST<INTEGER | FLOAT> A list representing the second vector.

Returns FLOAT

For more details, see the vector index documentation.

Considerations

vector.similarity.cosine(NULL, NULL) returns NULL.

vector.similarity.cosine(NULL, b) returns NULL.

vector.similarity.cosine(a, NULL) returns NULL.

Both vectors must be of the same dimension.

Both vectors must be valid with respect to cosine similarity.

The implementation is identical to that of the latest available vector index provider (vector-2.0).
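
A minimal sketch of a direct call. The exact scores assume the normalization of cosine similarity to the range [0, 1] described in the vector index documentation:

RETURN
  vector.similarity.cosine([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]) AS identical,
  vector.similarity.cosine([1.0, 2.0, 3.0], [-1.0, -2.0, -3.0]) AS opposite

Identical vectors score 1.0, while vectors pointing in opposite directions score 0.0.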

vector.similarity.euclidean()
Details

Syntax vector.similarity.euclidean(a, b)

Description Returns a FLOAT representing the similarity between the argument vectors based on their
Euclidean distance.

Arguments Name Type Description

a LIST<INTEGER | FLOAT> A list representing the first vector.

b LIST<INTEGER | FLOAT> A list representing the second vector.

Returns FLOAT

For more details, see the vector index documentation.

Considerations

vector.similarity.euclidean(NULL, NULL) returns NULL.

vector.similarity.euclidean(NULL, b) returns NULL.

vector.similarity.euclidean(a, NULL) returns NULL.

Both vectors must be of the same dimension.

Both vectors must be valid with respect to Euclidean similarity.

The implementation is identical to that of the latest available vector index provider (vector-2.0).

Example 307. k-Nearest Neighbors

k-nearest neighbor queries return the k entities with the highest similarity scores based on comparing
their associated vectors with a query vector. Such queries can be run against vector indexes in the
form of approximate k-nearest neighbor (k-ANN) queries, whose returned entities have a high
probability of being among the true k nearest neighbors. However, they can also be expressed as an
exhaustive search using vector similarity functions directly. While this is typically significantly slower
than using an index, it is exact rather than approximate and does not require an existing index. This
can be useful for one-off queries on small sets of data.

To create the graph used in this example, run the following query in an empty Neo4j database:

CREATE
(:Node { id: 1, vector: [1.0, 4.0, 2.0]}),
(:Node { id: 2, vector: [3.0, -2.0, 1.0]}),
(:Node { id: 3, vector: [2.0, 8.0, 3.0]});

Given a parameter query (here set to [4.0, 5.0, 6.0]), you can query for the two nearest neighbors
of that query vector by Euclidean distance. This is achieved by matching on all candidate vectors and
ordering on the similarity score:

Query

MATCH (node:Node)
WITH node, vector.similarity.euclidean($query, node.vector) AS score
RETURN node, score
ORDER BY score DESCENDING
LIMIT 2;

This returns the two nearest neighbors.

Result

node score

(:Node {vector: [2.0, 8.0, 3.0], id: 3}) 0.043478261679410934

(:Node {vector: [1.0, 4.0, 2.0], id: 1}) 0.03703703731298447

Rows: 2


GenAI integrations
Neo4j’s Vector indexes and Vector functions allow you to calculate the similarity between node and
relationship properties in a graph. A prerequisite for using these features is that vector embeddings have
been set as properties of these entities. The GenAI plugin enables the creation of such embeddings using
GenAI providers.

To use the GenAI plugin you need an account and API credentials from any of the following GenAI
providers: Vertex AI, OpenAI, Azure OpenAI (new in Neo4j 5.18), and Amazon Bedrock.

To learn more about using embeddings in Neo4j, see Vector indexes → Vectors and embeddings in Neo4j.

For a hands-on guide on how to use the GenAI plugin, see GenAI documentation - Embeddings & Vector
Indexes Tutorial → Create embeddings with cloud AI providers.

Installation
The GenAI plugin is enabled by default in Neo4j Aura.

The plugin needs to be installed on self-managed instances. This is done by moving the neo4j-genai.jar
file from /products to /plugins in the Neo4j home directory, or, if you are using Docker, by starting the
Docker container with the extra parameter --env NEO4J_PLUGINS='["genai"]'. For more information, see
Operations Manual → Configure plugins.

 Prior to Neo4j 5.23, the GenAI plugin was only available on Neo4j Enterprise Edition.

Example graph
The examples on this page use the Neo4j movie recommendations dataset, focusing on the plot and title
properties of Movie nodes.

The graph contains 28863 nodes and 332522 relationships. There are 9083 Movie nodes with a plot and
title property.

To recreate the graph, download and import this dump file to an empty Neo4j database (running version
5.17 or later). Dump files can be imported for both Aura and on-prem instances.

 The embeddings in this example graph are generated using OpenAI (model text-embedding-ada-002),
producing 1536-dimensional vectors.

Generate a single embedding and store it
Use the genai.vector.encode() function to generate a vector embedding for a single value.

Signature for genai.vector.encode() Function

genai.vector.encode(resource :: STRING, provider :: STRING, configuration :: MAP = {}) :: LIST<FLOAT>

• The resource (a STRING) is the object to transform into an embedding, such as a chunk of text or a
node/relationship property.

• The provider (a STRING) is the case-insensitive identifier of the provider to use. See identifiers under
GenAI providers for supported options.

• The configuration (a MAP) contains provider-specific settings, such as which model to invoke, as well
as any required API credentials. See GenAI providers for details of each supported provider. Note that
because this argument may contain sensitive data, it is obfuscated in the query.log. However, if the
function call is misspelled or the query is otherwise malformed, it may be logged without being
obfuscated.

This function sends one API request every time it is called, which may result in a lot of

 overhead in terms of both network traffic and latency. If you want to generate many
embeddings at once, use Generating a batch of embeddings and store them.

Use the db.create.setNodeVectorProperty procedure to store an embedding to a node property.

Signature for db.create.setNodeVectorProperty Procedure

db.create.setNodeVectorProperty(node :: NODE, key :: STRING, vector :: ANY)

Use the db.create.setRelationshipVectorProperty procedure to store an embedding to a relationship


property.

Signature for db.create.setRelationshipVectorProperty Procedure New

db.create.setRelationshipVectorProperty(relationship :: RELATIONSHIP, key :: STRING, vector :: ANY)

• node or relationship is the entity in which the new property will be stored.

• key (a STRING) is the name of the new property containing the embedding.

• vector is the object containing the embedding.

The embeddings are stored as properties on nodes or relationships with the type LIST<INTEGER | FLOAT>.

Example 308. Create an embedding from a single property and store it

Create an embedding property for the Godfather

MATCH (m:Movie {title:'Godfather, The'})


WHERE m.plot IS NOT NULL AND m.title IS NOT NULL
WITH m, m.title || ' ' || m.plot AS titleAndPlot ①
WITH m, genai.vector.encode(titleAndPlot, 'OpenAI', { token: $token }) AS propertyVector ②
CALL db.create.setNodeVectorProperty(m, 'embedding', propertyVector) ③
RETURN m.embedding AS embedding

① Concatenate the title and plot of the Movie into a single STRING.

② Create a 1536 dimensional embedding from the titleAndPlot.

③ Store the propertyVector as a new embedding property on The Godfather node.

Result

+------------------------------------------------------------------------------------------------------+
| embedding                                                                                             |
+------------------------------------------------------------------------------------------------------+
| [0.005239539314061403, -0.039358530193567276, -0.0005175105179660022, -0.038706034421920776, ... ]    |
+------------------------------------------------------------------------------------------------------+

 This result only shows the first 4 of the 1536 numbers in the embedding.

Generating a batch of embeddings and store them


Use the genai.vector.encodeBatch procedure to generate many vector embeddings with a single API
request. This procedure takes a list of resources as an input, and returns the same number of result rows,
instead of a single one.

This procedure attempts to generate embeddings for all supplied resources in a single
API request. Therefore, it is recommended to see the respective provider’s
 documentation for details on, for example, the maximum number of embeddings that
can be generated per request.

Signature for genai.vector.encodeBatch Procedure

genai.vector.encodeBatch(resources :: LIST<STRING>, provider :: STRING, configuration :: MAP = {}) ::


(index :: INTEGER, resource :: STRING, vector :: LIST<FLOAT>)

• The resources (a LIST<STRING>) parameter is the list of objects to transform into embeddings, such as
chunks of text.

• The provider (a STRING) is the case-insensitive identifier of the provider to use. See GenAI providers for
supported options.

• The configuration (a MAP) specifies provider-specific settings such as which model to invoke, as well
as any required API credentials. See GenAI providers for details of each supported provider. Note that
because this argument may contain sensitive data, it is obfuscated in the query.log. However, if the
function call is misspelled or the query is otherwise malformed, it may be logged without being
obfuscated.

Each returned row contains the following columns:

• The index (an INTEGER) is the index of the corresponding element in the input list, to aid in correlating
results back to inputs.

• The resource (a STRING) is the name of the input resource.

• The vector (a LIST<FLOAT>) is the generated vector embedding for this resource.

Example 309. Create embeddings from a limited number of properties and store them

MATCH (m:Movie WHERE m.plot IS NOT NULL)


WITH m
LIMIT 20
WITH collect(m) AS moviesList ①
WITH moviesList, [movie IN moviesList | movie.title || ': ' || movie.plot] AS batch ②
CALL genai.vector.encodeBatch(batch, 'OpenAI', { token: $token }) YIELD index, vector
WITH moviesList, index, vector
CALL db.create.setNodeVectorProperty(moviesList[index], 'embedding', vector) ③

① Collect all 20 Movie nodes into a LIST<NODE>.

② Use a list comprehension ([]) to extract the title and plot properties of the movies in moviesList
into a new LIST<STRING>.

③ db.create.setNodeVectorProperty is run for each vector returned by genai.vector.encodeBatch,


and stores that vector as a property named embedding on the corresponding node.

Example 310. Create embeddings from a large number of properties and store them

MATCH (m:Movie WHERE m.plot IS NOT NULL)


WITH collect(m) AS moviesList, ①
count(*) AS total,
100 AS batchSize ②
UNWIND range(0, total, batchSize) AS batchStart ③
CALL (moviesList, batchStart, batchSize) { ④
WITH [movie IN moviesList[batchStart .. batchStart + batchSize] | movie.title || ': ' ||
movie.plot] AS resources ⑤
CALL genai.vector.encodeBatch(resources, 'OpenAI', { token: $token }) YIELD index, vector
CALL db.create.setNodeVectorProperty(moviesList[batchStart + index], 'embedding', vector) ⑥
} IN TRANSACTIONS OF 1 ROW ⑦

① Collect all returned Movie nodes into a LIST<NODE>.

② batchSize defines the number of nodes in moviesList to be processed at once. Because vector
embeddings can be very large, a larger batch size may require significantly more memory on the
Neo4j server. Too large a batch size may also exceed the provider’s threshold.

③ Process Movie nodes in increments of batchSize.

④ A CALL subquery executes a separate transaction for each batch. Note that this CALL subquery
uses a variable scope clause (introduced in Neo4j 5.23) to import variables. If you are using an
older version of Neo4j, use an importing WITH clause instead.

⑤ resources is a list of strings, each being the concatenation of title and plot of one movie.

⑥ The procedure sets vector as value for the property named embedding for the node at position
batchStart + index in the moviesList.

⑦ IN TRANSACTIONS OF 1 ROW commits a separate transaction for each row, i.e. each batch is processed in its own transaction.

This example may not scale to larger datasets, as collect(m) requires the whole
result set to be loaded in memory. For an alternative method more suitable to
 processing large amounts of data, see GenAI documentation - Embeddings &
Vector Indexes Tutorial → Create embeddings with cloud AI providers.

GenAI providers
The following GenAI providers are supported for generating vector embeddings. Each provider has its own
configuration map that can be passed to genai.vector.encode or genai.vector.encodeBatch.

Vertex AI
• Identifier (provider argument): "VertexAI"

• Official Vertex AI documentation

Vertex AI provider details

Configuration map

token (STRING): API access token. Required.

projectId (STRING): GCP project ID. Required.

model (STRING): The name of the model you want to invoke. Default: "textembedding-gecko@001". Supported values: "textembedding-gecko@001", "textembedding-gecko@002", "textembedding-gecko@003", "textembedding-gecko-multilingual@001".

region (STRING): GCP region where to send the API requests. Default: "us-central1". Supported values: "us-west1", "us-west2", "us-west3", "us-west4", "us-central1", "us-east1", "us-east4", "us-south1", "northamerica-northeast1", "northamerica-northeast2", "southamerica-east1", "southamerica-west1", "europe-west1", "europe-west2", "europe-west3", "europe-west4", "europe-west6", "europe-west8", "europe-west9", "europe-north1", "europe-central2", "europe-southwest1", "asia-south1", "asia-southeast1", "asia-southeast2", "asia-east1", "asia-east2", "asia-northeast1", "asia-northeast2", "asia-northeast3", "australia-southeast1", "australia-southeast2", "me-west1".

taskType (STRING): The intended downstream application (see provider documentation). The specified taskType will apply to all resources in a batch. (New)

title (STRING): The title of the document that is being encoded (see provider documentation). The specified title will apply to all resources in a batch. (New)

OpenAI
• Identifier (provider argument): "OpenAI"

• Official OpenAI documentation

OpenAI provider details

Configuration map

token (STRING): API access token. Required.

model (STRING): The name of the model you want to invoke. Default: "text-embedding-ada-002".

dimensions (INTEGER): The number of dimensions you want to reduce the vector to. Only supported for certain models. Default: model-dependent.

Azure OpenAI (new in Neo4j 5.18)


• Identifier (provider argument): "AzureOpenAI"

• Official Azure OpenAI documentation

Unlike the other providers, the model is configured when creating the deployment on
 Azure, and is thus not part of the configuration map.

Azure OpenAI provider details

Configuration map

token (STRING): API access token. Required.

resource (STRING): The name of the resource to which the model has been deployed. Required.

deployment (STRING): The name of the model deployment. Required.

dimensions (INTEGER): The number of dimensions you want to reduce the vector to. Only supported for certain models. Default: model-dependent.

Amazon Bedrock
• Identifier (provider argument): "Bedrock"

• Official Bedrock documentation

Amazon Bedrock provider details

Configuration map

accessKeyId (STRING): AWS access key ID. Required.

secretAccessKey (STRING): AWS secret key. Required.

model (STRING): The name of the model you want to invoke. Default: "amazon.titan-embed-text-v1". Supported values: "amazon.titan-embed-text-v1".

region (STRING): AWS region where to send the API requests. Default: "us-east-1". Supported values: "us-east-1", "us-west-2", "ap-southeast-1", "ap-northeast-1", "eu-central-1".

Indexes
An index is a copy of specified primary data in a Neo4j database, such as nodes, relationships, or
properties. The data stored in the index provides an access path to the data in the primary storage and
allows users to evaluate query filters more efficiently (and, in some cases, semantically interpret query
filters). In short, much like indexes in a book, their function in a Neo4j graph database is to make data
retrieval more efficient.

Once an index has been created, it will be automatically populated and updated by the DBMS.

Neo4j supports two categories of indexes:

• Search-performance indexes, for speeding up data retrieval based on exact matches. This category
includes range, text, point, and token lookup indexes.

• Semantic indexes, for approximate matches and to compute similarity scores between a query string
and the matching data. This category includes full-text and vector indexes.

Search-performance indexes
Search-performance indexes enable quicker retrieval of exact matches between an index and the primary
data storage. There are four different search-performance indexes available in Neo4j:

• Range indexes: Neo4j’s default index. Supports most types of predicates.

• Text indexes: solves predicates operating on STRING values. Optimized for queries filtering with the
STRING operators CONTAINS and ENDS WITH.

• Point indexes: solves predicates on spatial POINT values. Optimized for queries filtering on distance or
within bounding boxes.

• Token lookup indexes: only solves node label and relationship type predicates (i.e. they cannot solve
any predicates filtering on properties). Two token lookup indexes (one for node labels and one for
relationship types) are present when a database is created in Neo4j.

To learn more about creating, listing, and deleting these indexes, as well as more details about the
predicates supported by each index type, see Create, show, and delete indexes.

For information about how indexes impact the performance of Cypher queries, as well as some heuristics
for when to use (and not to use) a search-performance index, see The impact of indexes on query
performance.

Search-performance indexes are used automatically, and if several indexes are available, the Cypher
planner will try to use the index (or indexes) that can most efficiently solve a particular predicate. It is,
however, possible to explicitly force a query to use a particular index with the USING keyword. For more
information, see Index hints for the Cypher planner.
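
A minimal sketch of such a hint, assuming a range index exists on the surname property of Person nodes:

MATCH (p:Person)
USING INDEX p:Person(surname)
WHERE p.surname = 'Smith'
RETURN p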


Create, show, and delete indexes
This page describes how to create, list, and delete search-performance indexes. The following index types
are included in this category:

• Range indexes

• Text indexes

• Point indexes

• Token lookup indexes

For information about how search-performance indexes are used in Cypher queries, see Using search-
performance indexes.

CREATE INDEX
Creating an index is done with the CREATE … INDEX … command. If no index type is specified in the create
command a range index will be created.

Best practice is to give the index a name when it is created. If the index is not explicitly named, it gets an
auto-generated name.

 The index name must be unique among both indexes and constraints.

The CREATE INDEX command is optionally idempotent. This means that its default behavior is to throw an
error if an attempt is made to create the same index twice. If IF NOT EXISTS is appended to the command,
no error is thrown and nothing happens should an index with the same name or same schema and index
type already exist. It may still throw an error if conflicting constraints exist, such as constraints with the
same name or schema and backing index type. As of Neo4j 5.17, an informational notification is instead
returned showing the existing index which blocks the creation.

Index providers and configuration settings can be specified using the OPTIONS clause. [8]

However, not all indexes have available configuration settings or more than one provider. In those cases,
nothing needs to be specified and the OPTIONS map should be omitted from the query.

 Creating an index requires the CREATE INDEX privilege.

A newly created index is not immediately available but is created in the background.

Create a range index

Creating a range index can be done with the CREATE INDEX command. Note that the index name must be
unique.

Range indexes have only one index provider available, range-1.0, and no supported index configuration.

Supported predicates

Range indexes support most types of predicates:

Predicate                  Syntax

Equality check             n.prop = value

List membership check      n.prop IN list

Existence check            n.prop IS NOT NULL

Range search               n.prop > value

Prefix search              STARTS WITH
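
For instance, given a range index on the surname property of Person nodes (such as the one created in the examples below), the Cypher planner can use it to solve the prefix predicate in a query like the following sketch:

MATCH (p:Person)
WHERE p.surname STARTS WITH 'Sm'
RETURN p.surname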

Examples

• Create a single-property range index for nodes

• Create a single-property range index for relationships

• Create a composite range index for nodes

• Create a composite range index for relationships

• Create a range index using a parameter

• Create a range index only if it does not already exist

Create a single-property range index for nodes

The following statement will create a named range index on all nodes labeled with Person and which have
the surname property.

Creating a node range index on a single property

CREATE INDEX node_range_index_name FOR (n:Person) ON (n.surname)

Create a single-property range index for relationships

The following statement will create a named range index on all relationships with relationship type KNOWS
and property since.

Creating a relationship range index on a single property

CREATE INDEX rel_range_index_name FOR ()-[r:KNOWS]-() ON (r.since)

Create a composite range index for nodes

A range index on multiple properties is also called a composite index. For node range indexes, only nodes
with the specified label and that contain all the specified properties will be added to the index.

The following statement will create a named composite range index on all nodes labeled with Person and
which have both an age and country property.

Creating a composite node range index on multiple properties

CREATE INDEX composite_range_node_index_name FOR (n:Person) ON (n.age, n.country)

Create a composite range index for relationships

A range index on multiple properties is also called a composite index. For relationship range indexes, only
relationships with the specified type and that contain all the specified properties will be added to the
index.

The following statement will create a named composite range index on all relationships labeled with
PURCHASED and which have both a date and amount property.

Creating a composite relationship range index on multiple properties

CREATE INDEX composite_range_rel_index_name FOR ()-[r:PURCHASED]-() ON (r.date, r.amount)

Create a range index using a parameter

This feature was introduced in Neo4j 5.16.

The following statement will create a named range index on all nodes with a Person label and a firstname
property using a parameter for the index name.

Parameters

{
"name": "range_index_param"
}

Creating a node range index on a single property

CREATE INDEX $name FOR (n:Person) ON (n.firstname)

Create a range index only if it does not already exist

If it is not known whether an index exists or not, add IF NOT EXISTS to ensure it does.

Creating a range index with IF NOT EXISTS

CREATE INDEX node_range_index_name IF NOT EXISTS


FOR (n:Person) ON (n.surname)

The index will not be created if there already exists an index with the same schema and type, same name
or both. As of Neo4j 5.17, an informational notification is instead returned.

Notification

`CREATE RANGE INDEX node_range_index_name IF NOT EXISTS FOR (e:Person) ON (e.surname)` has no effect.
`RANGE INDEX node_range_index_name FOR (e:Person) ON (e.surname)` already exists.

Create a text index

Creating a text index can be done with the CREATE TEXT INDEX command. Note that the index name must
be unique.

Text indexes have no supported index configuration and, as of Neo4j 5.1, they have two index providers
available, text-2.0 (default) and text-1.0 (deprecated).

Supported predicates

Text indexes only solve predicates operating on STRING values.

The following predicates that only operate on STRING values are always solvable by a text index:

• STARTS WITH

• ENDS WITH

• CONTAINS

However, other predicates are only used when it is known that the property is compared to a STRING:

• n.prop = "string"

• n.prop IN ["a", "b", "c"]

This means that a text index is not able to solve, for example, a.prop = b.prop, unless a property type
constraint also exists on the property.

Text indexes support the following predicates:

Predicate                  Syntax

Equality check             n.prop = 'example_string'

List membership check      n.prop IN ['abc', 'example_string', 'neo4j']

Prefix search              STARTS WITH

Suffix search              ENDS WITH

Substring search           CONTAINS

As of Neo4j 5.11, the above set of predicates can be extended with the use of property type constraints.
See the section about index compatibility and property type constraints for more information.

Text indexes are only used for exact query matches. To perform approximate matches

 (including, for example, variations and typos), and to compute a similarity score between
STRING values, use semantic full-text indexes instead.
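
For instance, given a text index on the nickname property of Person nodes (such as the one created in the examples below), the Cypher planner can use it to solve a substring predicate in a query like the following sketch:

MATCH (p:Person)
WHERE p.nickname CONTAINS 'the'
RETURN p.nickname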

Examples

• Create a node text index

• Create a relationship text index

• Create a text index using a parameter

• Create a text index only if it does not already exist

• Create a text index specifying the index provider

Create a node text index

The following statement will create a named text index on all nodes labeled with Person and which have
the nickname STRING property.

Creating a node text index on a single property

CREATE TEXT INDEX node_text_index_nickname FOR (n:Person) ON (n.nickname)

Create a relationship text index

The following statement will create a named text index on all relationships with relationship type KNOWS
and STRING property interest.

Creating a relationship text index on a single property

CREATE TEXT INDEX rel_text_index_name FOR ()-[r:KNOWS]-() ON (r.interest)

Create a text index using a parameter

This feature was introduced in Neo4j 5.16.

The following statement will create a named text index on all nodes with the Person label and the
favoriteColor STRING property using a parameter for the index name.

Parameters

{
"name": "text_index_param"
}

Creating a node text index on a single property

CREATE TEXT INDEX $name FOR (n:Person) ON (n.favoriteColor)

Create a text index only if it does not already exist

If it is not known whether an index exists or not, add IF NOT EXISTS to ensure it does.

The following statement will attempt to create a named text index on all nodes labeled with Person and
which have the nickname STRING property.

Creating a text index with IF NOT EXISTS

CREATE TEXT INDEX node_index_name IF NOT EXISTS FOR (n:Person) ON (n.nickname)

Note that the index will not be created if there already exists an index with the same schema and type,
same name or both. As of Neo4j 5.17, an informational notification is instead returned.

Notification

`CREATE TEXT INDEX node_index_name IF NOT EXISTS FOR (e:Person) ON (e.nickname)` has no effect.
`TEXT INDEX node_text_index_nickname FOR (e:Person) ON (e.nickname)` already exists.

Create a text index specifying the index provider

To create a text index with a specific index provider, the OPTIONS clause is used. The valid values for the
index provider are text-2.0 and text-1.0 (deprecated). The default provider is text-2.0.

Creating a text index with index provider

CREATE TEXT INDEX text_index_with_indexprovider FOR ()-[r:TYPE]-() ON (r.prop1)


OPTIONS {indexProvider: 'text-2.0'}

There is no supported index configuration for text indexes.

Create a point index

Creating a point index can be done with the CREATE POINT INDEX command. Note that the index name
must be unique.

Point indexes have supported index configuration, but only one index provider available, point-1.0.

Supported predicates

Point indexes only solve predicates operating on POINT values.

Point indexes support the following predicates:

Predicate                  Syntax

Property point value       n.prop = point({x: value, y: value})

Within bounding box        point.withinBBox(n.prop, lowerLeftCorner, upperRightCorner)

Distance                   point.distance(n.prop, center) <= distance

As of Neo4j 5.11, the above set of predicates can be extended with the use of property type constraints.
See Index compatibility and property type constraints for more information.

To learn more about the spatial data types supported by Cypher, see the page about
 Spatial values.
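
For instance, given a point index on the sublocation property of Person nodes (such as the one created in the examples below), the Cypher planner can use it to solve a distance predicate in a query like the following sketch:

MATCH (p:Person)
WHERE point.distance(p.sublocation, point({latitude: 55.61, longitude: 12.99})) <= 1000
RETURN p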

Examples

• Create a node point index

• Create a relationship point index

• Create a point index using a parameter

• Create a point index only if it does not already exist

• Create a point index specifying the index configuration

Create a node point index

The following statement will create a named point index on all nodes labeled with Person and which have
the sublocation POINT property.

Creating a node point index on a single property

CREATE POINT INDEX node_point_index_name FOR (n:Person) ON (n.sublocation)

Create a relationship point index

The following statement will create a named point index on all relationships with relationship type STREET
and POINT property intersection.

Creating a relationship point index on a single property

CREATE POINT INDEX rel_point_index_name FOR ()-[r:STREET]-() ON (r.intersection)

Create a point index using a parameter

This feature was introduced in Neo4j 5.16.

The following statement will create a named point index on all relationships with relationship type STREET
and POINT property coordinate using a parameter for the index name.

Parameters

{
"name": "point_index_param"
}

Creating a relationship point index on a single property

CREATE POINT INDEX $name FOR ()-[r:STREET]-() ON (r.coordinate)

Create a point index only if it does not already exist

If it is not known whether an index already exists, add IF NOT EXISTS to the command; the index is then only created if no such index exists, and no error is thrown otherwise.

Creating a point index with IF NOT EXISTS

CREATE POINT INDEX node_point_index IF NOT EXISTS
FOR (n:Person) ON (n.sublocation)

Note that the index will not be created if there already exists an index with the same schema and type,
same name or both. As of Neo4j 5.17, an informational notification is instead returned.

Notification

`CREATE POINT INDEX node_point_index IF NOT EXISTS FOR (e:Person) ON (e.sublocation)` has no effect.
`POINT INDEX node_point_index_name FOR (e:Person) ON (e.sublocation)` already exists.

Create a point index specifying the index configuration

To create a point index with a specific index configuration, use the indexConfig settings in the OPTIONS clause.
The valid configuration settings are:

• spatial.cartesian.min (default value: [-1000000.0, -1000000.0])

• spatial.cartesian.max (default value: [1000000.0, 1000000.0])

• spatial.cartesian-3d.min (default value: [-1000000.0, -1000000.0, -1000000.0])

• spatial.cartesian-3d.max (default value: [1000000.0, 1000000.0, 1000000.0])

• spatial.wgs-84.min (default value: [-180.0, -90.0])

• spatial.wgs-84.max (default value: [180.0, 90.0])

• spatial.wgs-84-3d.min (default value: [-180.0, -90.0, -1000000.0])

• spatial.wgs-84-3d.max (default value: [180.0, 90.0, 1000000.0])

The following statement will create a point index specifying the spatial.cartesian.min and
spatial.cartesian.max settings.

Creating a point index with index configuration

CREATE POINT INDEX point_index_with_config
FOR (n:Label) ON (n.prop2)
OPTIONS {
  indexConfig: {
    `spatial.cartesian.min`: [-100.0, -100.0],
    `spatial.cartesian.max`: [100.0, 100.0]
  }
}

Note that the wgs-84 and 3D cartesian settings, which are not specified in this example, will be set with
their respective default values.

Create a token lookup index

Two token lookup indexes are created by default when creating a Neo4j database (one node label lookup
index and one relationship type lookup index). Only one node label and one relationship type lookup index
can exist at the same time.

If a token lookup index has been deleted, it can be recreated with the CREATE LOOKUP INDEX command.
Note that the index name must be unique.

Token lookup indexes have only one index provider available, token-lookup-1.0, and no supported index
configuration.

Supported predicates

Token lookup indexes are present by default and solve only node label and relationship type predicates:

Predicate                       Syntax (example)

Node label predicate.           MATCH (n:Label)

                                MATCH (n)
                                WHERE n:Label

Relationship type predicate.    MATCH ()-[r:REL]->()

                                MATCH ()-[r]->()
                                WHERE r:REL

Token lookup indexes improve the performance of Cypher queries and the population of
 other indexes. Dropping these indexes may lead to severe performance degradation.

Examples

• Create a node label lookup index

• Create a relationship type lookup index

• Create a token lookup index only if it does not already exist

Create a node label lookup index

The following statement will create a named node label lookup index on all nodes with one or more labels:

Creating a node label lookup index

CREATE LOOKUP INDEX node_label_lookup_index FOR (n) ON EACH labels(n)

 Only one node label lookup index can exist at a time.

Create a relationship type lookup index

The following statement will create a named relationship type lookup index on all relationships with any
relationship type.

Creating a relationship type lookup index

CREATE LOOKUP INDEX rel_type_lookup_index FOR ()-[r]-() ON EACH type(r)

 Only one relationship type lookup index can exist at a time.

Create a token lookup index only if it does not already exist

If it is not known whether an index already exists, add IF NOT EXISTS to the command; the index is then only created if no such index exists, and no error is thrown otherwise.

Creating a node label lookup index with IF NOT EXISTS

CREATE LOOKUP INDEX node_label_lookup IF NOT EXISTS FOR (n) ON EACH labels(n)

The index will not be created if there already exists an index with the same schema and type, same name
or both. As of Neo4j 5.17, an informational notification is instead returned.

Notification

`CREATE LOOKUP INDEX node_label_lookup IF NOT EXISTS FOR (e) ON EACH labels(e)` has no effect.
`LOOKUP INDEX node_label_lookup_index FOR (e) ON EACH labels(e)` already exists.

Creating an index when a conflicting index or constraint exists

• Failure to create an already existing index

• Failure to create an index with the same name as an already existing index

• Failure to create an index when a constraint already exists

• Failure to create an index with the same name as an already existing constraint

Failure to create an already existing index

Create an index on the property title on nodes with the Book label, when that index already exists.

Creating a duplicate index

CREATE INDEX bookTitleIndex FOR (book:Book) ON (book.title)

In this case, the index cannot be created because it already exists.

Error message

There already exists an index (:Book {title}).

Using IF NOT EXISTS when creating the index would result in no error and would not create a new index.
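
For example, the following variant (a sketch reusing the index above) succeeds without creating anything
when bookTitleIndex already exists:

CREATE INDEX bookTitleIndex IF NOT EXISTS FOR (book:Book) ON (book.title)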

Failure to create an index with the same name as an already existing index

Create a named index on the property numberOfPages on nodes with the Book label, when an index with
the given name already exists. The index type of the existing index does not matter.

Creating an index with a duplicated name

CREATE INDEX indexOnBooks FOR (book:Book) ON (book.numberOfPages)

In this case, the index cannot be created because there already exists an index with the given name.

Error message

There already exists an index called 'indexOnBooks'.

Using IF NOT EXISTS when creating the index would result in no error and would not create a new index.

Failure to create an index when a constraint already exists

Create an index on the property isbn on nodes with the Book label, when an index-backed constraint
already exists on that schema. This is only relevant for range indexes.

Creating a range index on same schema as existing index-backed constraint

CREATE INDEX bookIsbnIndex FOR (book:Book) ON (book.isbn)

In this case, the index cannot be created because an index-backed constraint already exists on that label
and property combination.

Error message

There is a uniqueness constraint on (:Book {isbn}), so an index is already created that matches this.

Failure to create an index with the same name as an already existing constraint

Create a named index on the property numberOfPages on nodes with the Book label, when a constraint with
the given name already exists.

Creating an index with same name as an existing constraint

CREATE INDEX bookRecommendations FOR (book:Book) ON (book.recommendations)

In this case, the index cannot be created because there already exists a constraint with the given name.

Error message

There already exists a constraint called 'bookRecommendations'.

SHOW INDEXES
Listing indexes can be done with SHOW INDEXES.

 Listing indexes requires the SHOW INDEX privilege.

Examples

• Listing all indexes

• Listing specific columns

• Listing indexes with filtering

Listing all indexes

To list all indexes with the default output columns, the SHOW INDEXES command can be used. If all columns
are required, use SHOW INDEXES YIELD *.

Showing all indexes

SHOW INDEXES

Result

+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----+
| id | name | state | populationPercent | type | entityType |
labelsOrTypes | properties | indexProvider | owningConstraint | lastRead |
readCount |
+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----+
| 3 | "composite_range_node_index_name" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Person"] | ["age", "country"] | "range-1.0" | NULL | NULL | 0
|
| 4 | "composite_range_rel_index_name" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" |
["PURCHASED"] | ["date", "amount"] | "range-1.0" | NULL | 2023-03-13T11:41:44.537Z | 1
|
| 16 | "example_index" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Book"] | ["title"] | "range-1.0" | NULL | 2023-04-10T15:41:44.537Z | 2
|
| 17 | "indexOnBooks" | "ONLINE" | 100.0 | "TEXT" | "NODE" |
["Label1"] | ["prop1"] | "text-2.0" | NULL | NULL | 0
|
| 14 | "node_label_lookup_index" | "ONLINE" | 100.0 | "LOOKUP" | "NODE" | NULL
| NULL | "token-lookup-1.0" | NULL | 2023-04-13T08:11:15.537Z | 10 |
| 10 | "node_point_index_name" | "ONLINE" | 100.0 | "POINT" | "NODE" |
["Person"] | ["sublocation"] | "point-1.0" | NULL | 2023-04-05T16:21:44.692Z | 1
|
| 1 | "node_range_index_name" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Person"] | ["surname"] | "range-1.0" | NULL | 2022-12-30T02:01:44.537Z | 6
|
| 6 | "node_text_index_nickname" | "ONLINE" | 100.0 | "TEXT" | "NODE" |
["Person"] | ["nickname"] | "text-2.0" | NULL | 2023-04-13T11:41:44.537Z | 2
|
| 12 | "point_index_param" | "ONLINE" | 100.0 | "POINT" | "RELATIONSHIP" |
["STREET"] | ["coordinate"] | "point-1.0" | NULL | NULL | 0
|
| 13 | "point_index_with_config" | "ONLINE" | 100.0 | "POINT" | "NODE" |
["Label"] | ["prop2"] | "point-1.0" | NULL | NULL | 0
|
| 5 | "range_index_param" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Person"] | ["firstname"] | "range-1.0" | NULL | 2023-12-13T08:23:53.338Z | 2
|
| 11 | "rel_point_index_name" | "ONLINE" | 100.0 | "POINT" | "RELATIONSHIP" |
["STREET"] | ["intersection"] | "point-1.0" | NULL | 2023-03-03T13:37:42.537Z | 2
|
| 2 | "rel_range_index_name" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" |
["KNOWS"] | ["since"] | "range-1.0" | NULL | 2023-04-12T10:41:44.692Z | 5
|
| 7 | "rel_text_index_name" | "ONLINE" | 100.0 | "TEXT" | "RELATIONSHIP" |
["KNOWS"] | ["interest"] | "text-2.0" | NULL | 2023-04-01T10:40:44.537Z | 3
|
| 15 | "rel_type_lookup_index" | "ONLINE" | 100.0 | "LOOKUP" | "RELATIONSHIP" | NULL
| NULL | "token-lookup-1.0" | NULL | 2023-04-12T21:41:44.537Z | 7 |
| 8 | "text_index_param" | "ONLINE" | 100.0 | "TEXT" | "NODE" |
["Person"] | ["favoriteColor"] | "text-2.0" | NULL | NULL | 0
|
| 9 | "text_index_with_indexprovider" | "ONLINE" | 100.0 | "TEXT" | "RELATIONSHIP" |
["TYPE"] | ["prop1"] | "text-2.0" | NULL | NULL | 0
|
| 18 | "uniqueBookIsbn" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Book"] | ["isbn"] | "range-1.0" | "uniqueBookIsbn" | 2023-04-13T11:41:44.692Z | 6
|
+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----+
18 rows

One of the output columns from SHOW INDEXES is the name of the index. This can be used to drop the index
with the DROP INDEX command.
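
For instance, the name column can first be inspected and then used in a separate DROP INDEX command (a
minimal sketch reusing the example_index listed above):

SHOW INDEXES YIELD name
WHERE name = 'example_index'
RETURN name

DROP INDEX example_index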

Listing specific columns

It is possible to return only specific columns of the available indexes using the YIELD clause:

Returning specific columns for all indexes

SHOW INDEXES
YIELD name, type, indexProvider AS provider, options, createStatement
RETURN name, type, provider, options.indexConfig AS config, createStatement

Result

+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------+
| name | type | provider | config
| createStatement
|
+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------+
| "composite_range_node_index_name" | "RANGE" | "range-1.0" | {}
| "CREATE RANGE INDEX `composite_range_node_index_name` FOR (n:`Person`) ON (n.`age`, n.`country`)"
|
| "composite_range_rel_index_name" | "RANGE" | "range-1.0" | {}
| "CREATE RANGE INDEX `composite_range_rel_index_name` FOR ()-[r:`PURCHASED`]-() ON (r.`date`,
r.`amount`)"
|
| "example_index" | "RANGE" | "range-1.0" | {}
| "CREATE RANGE INDEX `example_index` FOR (n:`Book`) ON (n.`title`)"
|
| "indexOnBooks" | "TEXT" | "text-2.0" | {}
| "CREATE TEXT INDEX `indexOnBooks` FOR (n:`Label1`) ON (n.`prop1`) OPTIONS {indexConfig: {},
indexProvider: 'text-2.0'}"
|
| "index_343aff4e" | "LOOKUP" | "token-lookup-1.0" | {}
| "CREATE LOOKUP INDEX `index_343aff4e` FOR (n) ON EACH labels(n)"
|
| "index_f7700477" | "LOOKUP" | "token-lookup-1.0" | {}
| "CREATE LOOKUP INDEX `index_f7700477` FOR ()-[r]-() ON EACH type(r)"
|
| "node_point_index_name" | "POINT" | "point-1.0" | {`spatial.cartesian.min`: [-
1000000.0, -1000000.0], `spatial.wgs-84.min`: [-180.0, -90.0], `spatial.wgs-84.max`: [180.0, 90.0],
`spatial.cartesian.max`: [1000000.0, 1000000.0], `spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],
`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0], `spatial.cartesian-3d.max`: [1000000.0,
1000000.0, 1000000.0], `spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0]} | "CREATE POINT INDEX
`node_point_index_name` FOR (n:`Person`) ON (n.`sublocation`) OPTIONS {indexConfig: {`spatial.cartesian-
3d.max`: [1000000.0, 1000000.0, 1000000.0],`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0,
-1000000.0],`spatial.cartesian.max`: [1000000.0, 1000000.0],`spatial.cartesian.min`: [-1000000.0,
-1000000.0],`spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],`spatial.wgs-84-3d.min`: [-180.0, -90.0,
-1000000.0],`spatial.wgs-84.max`: [180.0, 90.0],`spatial.wgs-84.min`: [-180.0, -90.0]}, indexProvider:
'point-1.0'}" |
| "node_range_index" | "RANGE" | "range-1.0" | {}
| "CREATE RANGE INDEX `node_range_index` FOR (n:`Person`) ON (n.`surname`)"
|
| "node_text_index_nickname" | "TEXT" | "text-2.0" | {}
| "CREATE TEXT INDEX `node_text_index_nickname` FOR (n:`Person`) ON (n.`nickname`) OPTIONS {indexConfig:
{}, indexProvider: 'text-2.0'}"
|
| "point_index_with_config" | "POINT" | "point-1.0" | {`spatial.cartesian.min`: [-100.0,
-100.0], `spatial.wgs-84.min`: [-180.0, -90.0], `spatial.wgs-84.max`: [180.0, 90.0],
`spatial.cartesian.max`: [100.0, 100.0], `spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],
`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0], `spatial.cartesian-3d.max`: [1000000.0,

1000000.0, 1000000.0], `spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0]} | "CREATE
POINT INDEX `point_index_with_config` FOR (n:`Label`) ON (n.`prop2`) OPTIONS {indexConfig:
{`spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0],`spatial.cartesian-3d.min`: [-1000000.0,
-1000000.0, -1000000.0],`spatial.cartesian.max`: [100.0, 100.0],`spatial.cartesian.min`: [-100.0,
-100.0],`spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],`spatial.wgs-84-3d.min`: [-180.0, -90.0,
-1000000.0],`spatial.wgs-84.max`: [180.0, 90.0],`spatial.wgs-84.min`: [-180.0, -90.0]}, indexProvider:
'point-1.0'}" |
| "rel_point_index_name" | "POINT" | "point-1.0" | {`spatial.cartesian.min`: [-
1000000.0, -1000000.0], `spatial.wgs-84.min`: [-180.0, -90.0], `spatial.wgs-84.max`: [180.0, 90.0],
`spatial.cartesian.max`: [1000000.0, 1000000.0], `spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],
`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0], `spatial.cartesian-3d.max`: [1000000.0,
1000000.0, 1000000.0], `spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0]} | "CREATE POINT INDEX
`rel_point_index_name` FOR ()-[r:`STREET`]-() ON (r.`intersection`) OPTIONS {indexConfig:
{`spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0],`spatial.cartesian-3d.min`: [-1000000.0,
-1000000.0, -1000000.0],`spatial.cartesian.max`: [1000000.0, 1000000.0],`spatial.cartesian.min`: [-
1000000.0, -1000000.0],`spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],`spatial.wgs-84-3d.min`: [-180.0,
-90.0, -1000000.0],`spatial.wgs-84.max`: [180.0, 90.0],`spatial.wgs-84.min`: [-180.0, -90.0]},
indexProvider: 'point-1.0'}" |
| "rel_range_index_name" | "RANGE" | "range-1.0" | {}
| "CREATE RANGE INDEX `rel_range_index_name` FOR ()-[r:`KNOWS`]-() ON (r.`since`)"
|
| "rel_text_index_name" | "TEXT" | "text-2.0" | {}
| "CREATE TEXT INDEX `rel_text_index_name` FOR ()-[r:`KNOWS`]-() ON (r.`interest`) OPTIONS {indexConfig:
{}, indexProvider: 'text-2.0'}"
|
| "text_index_with_indexprovider" | "TEXT" | "text-2.0" | {}
| "CREATE TEXT INDEX `text_index_with_indexprovider` FOR ()-[r:`TYPE`]-() ON (r.`prop1`) OPTIONS
{indexConfig: {}, indexProvider: 'text-2.0'}"
|
| "uniqueBookIsbn" | "RANGE" | "range-1.0" | {}
| "CREATE CONSTRAINT `uniqueBookIsbn` FOR (n:`Book`) REQUIRE (n.`isbn`) IS UNIQUE OPTIONS {indexConfig:
{}, indexProvider: 'range-1.0'}"
|
+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------+

Note that YIELD is mandatory if the RETURN clause is used. RETURN is not, however, mandatory when the
YIELD clause is used.
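
For example, the following command (a minimal sketch) is valid on its own and returns the yielded columns
directly:

SHOW INDEXES YIELD name, type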

Listing indexes with filtering

The SHOW INDEX command can be filtered in various ways.

For example, to show only range indexes, use SHOW RANGE INDEXES.

Another more flexible way of filtering the output is to use the WHERE clause. An example is to only show
indexes not belonging to constraints.
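
For instance, the following command (a minimal sketch filtering on one of the default output columns shown
above) lists only indexes on relationships:

SHOW INDEXES WHERE entityType = 'RELATIONSHIP'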

To show only range indexes that do not belong to a constraint, the two filtering methods can be combined.

Showing range indexes

SHOW RANGE INDEXES WHERE owningConstraint IS NULL

Result

+---------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------+
| id | name | state | populationPercent | type | entityType |
labelsOrTypes | properties | indexProvider | owningConstraint | lastRead |
readCount |
+---------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------+
| 3 | "composite_range_node_index_name" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Person"] | ["age", "country"] | "range-1.0" | NULL | NULL | 0
|
| 4 | "composite_range_rel_index_name" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" |
["PURCHASED"] | ["date", "amount"] | "range-1.0" | NULL | 2023-03-13T11:41:44.537Z | 1
|
| 16 | "example_index" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Book"] | ["title"] | "range-1.0" | NULL | 2023-04-10T15:41:44.537Z | 2
|
| 1 | "node_range_index_name" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Person"] | ["surname"] | "range-1.0" | NULL | 2022-12-30T02:01:44.537Z | 6
|
| 5 | "range_index_param" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Person"] | ["firstname"] | "range-1.0" | NULL | 2023-12-13T08:23:53.338Z | 2
|
| 2 | "rel_range_index_name" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" |
["KNOWS"] | ["since"] | "range-1.0" | NULL | 2023-04-12T10:41:44.692Z | 5
|
+---------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------+
6 rows

This will only return the default output columns.

To get all columns, use:

SHOW RANGE INDEXES YIELD * WHERE owningConstraint IS NULL

Result columns for listing indexes

The below table contains the full information about all columns returned by the SHOW INDEXES YIELD *
command:

List indexes output

• id (INTEGER): The id of the index. Default output.

• name (STRING): Name of the index (explicitly set by the user or automatically assigned). Default output.

• state (STRING): Current state of the index. Default output.

• populationPercent (FLOAT): % of index population. Default output.

• type (STRING): The IndexType of this index (FULLTEXT, LOOKUP, POINT, RANGE, or TEXT). Default output.

• entityType (STRING): Type of entities this index represents (nodes or relationships). Default output.

• labelsOrTypes (LIST<STRING>): The labels or relationship types of this index. Default output.

• properties (LIST<STRING>): The properties of this index. Default output.

• indexProvider (STRING): The index provider for this index. Default output.

• owningConstraint (STRING): The name of the constraint the index is associated with, or null if the index
is not associated with any constraint. Default output.

• lastRead (ZONED DATETIME): The last time the index was used for reading. Returns null if the index has
not been read since trackedSince, or if the statistics are not tracked. Default output. New

• readCount (INTEGER): The number of read queries that have been issued to this index since trackedSince,
or null if the statistics are not tracked. Default output. New

• trackedSince (ZONED DATETIME): The time when usage statistics tracking started for this index, or null
if the statistics are not tracked. New

• options (MAP): Information retrieved from the OPTIONS map about the provider and configuration settings
for an index. If neither is specified when creating the index, this column will return the default values.

• failureMessage (STRING): The failure description of a failed index.

• createStatement (STRING): Statement used to create the index.

DROP INDEX
An index can be dropped (removed) using the name with the DROP INDEX index_name command. This
command can drop indexes of any type, except those backing constraints. The name of the index can be
found using the SHOW INDEXES command, given in the output column name.

DROP INDEX index_name [IF EXISTS]

The DROP INDEX command is optionally idempotent. This means that its default behavior is to throw an
error if an attempt is made to drop the same index twice. With IF EXISTS, no error is thrown and nothing
happens should the index not exist. As of Neo4j 5.17, an informational notification is instead returned
detailing that the index does not exist.

 Dropping an index requires the DROP INDEX privilege.

Examples

• Drop an index

• Drop an index using a parameter

• Failure to drop an index backing a constraint

• Drop a non-existing index

Drop an index

The following statement will attempt to drop the index named example_index.

Dropping an index

DROP INDEX example_index

If an index with that name exists, it is removed; if not, the command fails.

Drop an index using a parameter

This feature was introduced in Neo4j 5.16.

The following statement will attempt to drop the index named range_index_param using a parameter for
the index name.

Parameters

{
"name": "range_index_param"
}

Dropping an index

DROP INDEX $name

If an index with that name exists, it is removed; if not, the command fails.

Failure to drop an index backing a constraint

It is not possible to drop indexes that back constraints.

Dropping an index backing a constraint

DROP INDEX uniqueBookIsbn

Error message

Unable to drop index: Index belongs to constraint: `uniqueBookIsbn`

Dropping the index-backed constraint will also remove the backing index. For more information, see Drop
a constraint by name.
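
For example, dropping the constraint from the error message above also removes its backing index (a
minimal sketch):

DROP CONSTRAINT uniqueBookIsbn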

Drop a non-existing index

If it is uncertain whether an index exists, and you want to drop it if it does without getting an error if it
does not, use IF EXISTS.

The following statement will attempt to drop the index named missing_index_name.

Dropping an index with IF EXISTS

DROP INDEX missing_index_name IF EXISTS

If an index with that name exists, it is removed; if not, the command does nothing. As of Neo4j 5.17, an
informational notification is additionally returned.

Notification

`DROP INDEX missing_index_name IF EXISTS` has no effect. `missing_index_name` does not exist.


The impact of indexes on query performance


Search-performance indexes enable quicker and more efficient pattern matching by solving a particular
combination of node label/relationship type and property predicate. They are used automatically by the
Cypher planner in MATCH clauses, usually at the start of a query, to scan the graph for the most appropriate
place to start the pattern-matching process.

By examining query execution plans, this page will explain the scenarios in which the various search-
performance indexes are used to improve the performance of Cypher queries. It will also provide some
general heuristics for when to use indexes, and advice about how to avoid over-indexing.

Example graph
The examples on this page center around finding routes and points of interest in Central Park, New York,
based on data provided by OpenStreetMap. The data model contains two node labels:

• OSMNode (Open Street Map Node) — a junction node with geo-spatial properties linking together routes
from specific points.

• PointOfInterest — a subcategory of OSMNode. In addition to geospatial properties, these nodes also
contain information about specific points of interest, such as statues, baseball courts, etc. in Central
Park.

The data model also contains one relationship type: ROUTE, which specifies the distance in meters between
the nodes in the graph.

In total, the graph contains 69165 nodes (of which 188 have the label PointOfInterest) and 152077
ROUTE relationships.

To recreate the graph, download and import the 5.0 dump file to an empty Neo4j database. Dump files can
be imported for both Aura and on-prem instances.

Token lookup indexes


Two token lookup indexes are present by default when creating a Neo4j database. They store copies of all
node labels and relationship types in the database and only solve node label and relationship type
predicates.

The following query, which counts the number of PointOfInterest nodes that have value baseball for
the type property, will access the node label lookup index:

Query

PROFILE
MATCH (n:PointOfInterest)
WHERE n.type = 'baseball'
RETURN count(n)

Result

count(n)

26

Rows: 1

Execution plan

+-------------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) |
Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | `count(n)` | 1 | 1 | 0 | 0 |
0/0 | 0.075 | In Pipeline 1 |
| | +----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +EagerAggregation | 1 | count(n) AS `count(n)` | 1 | 1 | 0 | 32 |
| | |
| | +----+------------------------+----------------+------+---------+----------------+
| | |
| +Filter | 2 | n.type = $autostring_0 | 9 | 26 | 376 | |
| | |
| | +----+------------------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | 3 | n:PointOfInterest | 188 | 188 | 189 | 376 |
116/0 | 8.228 | Fused in Pipeline 0 |
+-------------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 565, total allocated memory: 472

The following details are worth highlighting in the execution plan:

• The NodeByLabelScan operator accesses the node label lookup index and produces 188 rows,
representing the 188 nodes with the PointOfInterest label in the database.

• The query required 565 DB hits (each DB hit represents an instance when the query required access to
the database).

• The query completed in just over 8 milliseconds.

Token lookup indexes are very important because they improve the performance of Cypher queries and
the population of other indexes, and deleting them will lead to severe performance degradation. In the
above example, had a node label lookup index not existed, the NodeByLabelScan operator would have been
replaced with AllNodesScan, which would have had to read all 69165 nodes from the database before
returning a result.

While useful, token lookup indexes will rarely be sufficient for applications querying databases of a non-
trivial size because they cannot solve any property-related predicates.

For more information about the predicates supported by token lookup indexes, see Create, show, and
delete indexes → Token lookup indexes: supported predicates.

Range indexes
Range indexes solve most types of predicates, and they are used for efficiently retrieving data based on a
range of values. They are particularly useful for dealing with properties that have ordered, comparable
values.

The following example first creates a relevant index on the type property for PointOfInterest nodes, and
then runs the above query again, counting the number of PointOfInterest nodes that have a baseball
type value:

Create a range index

CREATE INDEX range_index_type FOR (n:PointOfInterest) ON (n.type)

If no index type is specified when creating an index, Neo4j defaults to creating a range index. For more
information about creating indexes, see Create, show, and delete indexes → CREATE INDEX.

Rerun query after the creation of a relevant index

PROFILE
MATCH (n:PointOfInterest)
WHERE n.type = 'baseball'
RETURN count(n)

Execution plan

+-------------------+----+----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Id | Details | Estimated Rows
| Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | 0 | `count(n)` | 1
| 1 | 0 | 0 | 0/0 | 0.057 | In Pipeline 1 |
| | +----+----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +EagerAggregation | 1 | count(n) AS `count(n)` | 1
| 1 | 0 | 32 | | | |
| | +----+----------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexSeek | 2 | RANGE INDEX n:PointOfInterest(type) WHERE type = $autostring_0 | 5
| 26 | 27 | 376 | 0/1 | 0.945 | Fused in Pipeline 0 |
+-------------------+----+----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 27, total allocated memory: 472

Comparing this query plan with the plan generated before the creation of a relevant range index, the
following has changed:

• NodeByLabelScan has been replaced by NodeIndexSeek. This only produces 26 rows (representing
the 26 PointOfInterest nodes in the database with a type value set to baseball).

• The query now only requires 27 DB hits.

• The query completed in less than 1 millisecond - almost 8 times faster than it took the query to
complete without a range index.

These points all illustrate the fundamental point that search-performance indexes can significantly improve
the performance of Cypher queries.

For more information about the predicates supported by range indexes, see Create, show, and delete
indexes → Range indexes: supported predicates.

Text indexes
Text indexes are used for queries filtering on STRING properties.

If there exists both a range and a text index on a given STRING property, the text index will only be used by
the Cypher planner for queries filtering with the CONTAINS or ENDS WITH operators. In all other cases, the
range index will be used.

To show this behavior, it is necessary to create a text index and a range index on the same property:

Create a text index

CREATE TEXT INDEX text_index_name FOR (n:PointOfInterest) ON (n.name)

Create a range index

CREATE INDEX range_index_name FOR (n:PointOfInterest) ON (n.name)

The following query filters all PointOfInterest nodes with a name property that CONTAINS 'William':

Query filtering on what a STRING property CONTAINS

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name CONTAINS 'William'
RETURN n.name AS name, n.type AS type

Result

name type

"William Shakespeare" "statue"

"William Tecumseh Sherman" "equestrian statue"

Rows: 2

Execution plan

+------------------------+----+----------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Id | Details |
Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+------------------------+----+----------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | 0 | name, type |
1 | 2 | 0 | 0 | | | |
| | +----+----------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Projection | 1 | cache[n.name] AS name, cache[n.type] AS type |
1 | 2 | 0 | | | | |
| | +----+----------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +CacheProperties | 2 | cache[n.type], cache[n.name] |
1 | 2 | 6 | | | | |
| | +----+----------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexContainsScan | 3 | TEXT INDEX n:PointOfInterest(name) WHERE name CONTAINS $autostring_0 |
1 | 2 | 3 | 248 | 4/0 | 53.297 | Fused in Pipeline 0 |
+------------------------+----+----------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 9, total allocated memory: 312

The plan shows that the query uses the text index to find all relevant nodes. If, however, the query is
changed to use the STARTS WITH operator instead of CONTAINS, the query will use the range index instead:

Query filtering on what a STRING property STARTS WITH

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name STARTS WITH 'William'
RETURN n.name, n.type

Execution plan

+-----------------------+----
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+-----------------------+----
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | 0 | `n.name`, `n.type`
| 1 | 2 | 0 | 0 | | |
|
| | +----
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Projection | 1 | cache[n.name] AS `n.name`, n.type AS `n.type`
| 1 | 2 | 4 | | | |
|
| | +----
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexSeekByRange | 2 | RANGE INDEX n:PointOfInterest(name) WHERE name STARTS WITH $autostring_0,
cache[n.name] | 1 | 2 | 3 | 248 | 4/1 | 1.276 |
Fused in Pipeline 0 |
+-----------------------+----
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 7, total allocated memory: 312

This is because range indexes store STRING values alphabetically. This means that, while they are very
efficient for retrieving exact matches of a STRING, or for prefix matching, they are less efficient for suffix
and contains searches, where they have to scan all relevant properties to filter any matches. Text indexes
do not store STRING properties alphabetically, and are instead optimized for suffix and contains searches.
That said, if no range index had been present on the name property, the previous query would still have
been able to utilize the text index. It would have done so less efficiently than a range index, but it still
would have been useful.
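
For instance, a query using ENDS WITH on the same property would likewise be planned with the text index
rather than the range index (a sketch; the suffix value is illustrative):

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name ENDS WITH 'Shakespeare'
RETURN n.name AS name, n.type AS type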

For more information about range index ordering, see the section on Range index-backed ORDER BY.

Text indexes are only used for exact query matches. To perform approximate matches

 (including, for example, variations and typos), and to compute a similarity score between
STRING values, use semantic full-text indexes instead.

For more information about the predicates supported by text indexes, see Create, show, and delete
indexes → Text indexes: supported predicates.

Ensuring text index use

In order for the planner to use text indexes, it must be able to confirm that the properties included in the
predicate are STRING values. This is not possible when accessing property values within nodes or
relationships, or values within a MAP, since Cypher does not store the type information of these values. To
ensure text indexes are used in these cases, the toString function should be used.

Text index not used

WITH {name: 'William Shakespeare'} AS varName
MERGE (:PointOfInterest {name: varName.name})

Text index used

WITH {name: 'William Shakespeare'} AS varName
MERGE (:PointOfInterest {name: toString(varName.name)})

For information about how to ensure the use of text indexes when predicates may contain null values, see
Indexes and null values.

Text indexes and STRING sizes

The size of the indexed STRING properties is also relevant to the planner’s selection between range and
text indexes.

Range indexes have a maximum key size limit of around 8 kb. This means that range indexes cannot be
used to index STRING values larger than 8 kb. Text indexes, on the other hand, have a maximum key size
limit of around 32 kb. As a result, they can be used to index STRING values up to that size.

For information about calculating the size of indexes, see Neo4j Knowledge Base → A method to calculate
the size of an index in Neo4j.

Point indexes
Point indexes solve predicates operating on spatial POINT values. Point indexes are optimized for queries
filtering for the distance between property values, or for property values within a bounding box.

The following example creates a point index which is then used in a query returning the name and type of
all PointOfInterest nodes within a set bounding box:

Create a point index

CREATE POINT INDEX point_index_location FOR (n:PointOfInterest) ON (n.location)

Query using the point.withinBBox() function

PROFILE
MATCH (n:PointOfInterest)
WHERE point.withinBBox(
n.location,
point({srid: 4326, x: -73.9723702, y: 40.7697989}),
point({srid: 4326, x: -73.9725659, y: 40.770193}))
RETURN n.name AS name, n.type AS type

Result

name type

"Heckscher Ballfield 3" "baseball"

"Heckscher Ballfield 4" "baseball"

"Heckscher Ballfield 1" "baseball"

"Robert Burns" "statue"

"Christopher Columbus" "statue"

"Walter Scott" "statue"

"William Shakespeare" "statue"

"Balto" "statue"

Rows: 8

Execution plan

+-----------------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+-----------------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | 0 | `n.name`, `n.type`
| 4 | 8 | 0 | 0 | | |
|
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Projection | 1 | cache[n.name] AS `n.name`, cache[n.type] AS `n.type`
| 4 | 8 | 0 | | | |
|
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +CacheProperties | 2 | cache[n.type], cache[n.name]
| 4 | 8 | 24 | | | |
|
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexSeekByRange | 3 | POINT INDEX n:PointOfInterest(location) WHERE point.withinBBox(location,
point($autoint_0, $autodoub | 4 | 8 | 10 | 248 | 302/0 |
2.619 | Fused in Pipeline 0 |
| | | le_1, $autodouble_2), point($autoint_3, $autodouble_4, $autodouble_5))
| | | | | | |
|
+-----------------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 34, total allocated memory: 312

For more information about the predicates supported by point indexes, see Create, show, and delete
indexes → Point indexes: supported predicates.

Point index configuration settings

It is possible to configure point indexes to only index properties within a specific geographical area. This is
done by specifying either of the following settings in the indexConfig part of the OPTIONS clause when
creating a point index:

• spatial.cartesian.min and spatial.cartesian.max: used for Cartesian 2D coordinate systems.

• spatial.cartesian-3d.min and spatial.cartesian-3d.max: used for Cartesian 3D coordinate systems.

• spatial.wgs-84.min and spatial.wgs-84.max: used for WGS-84 2D coordinate systems.

• spatial.wgs-84-3d.min and spatial.wgs-84-3d.max: used for WGS-84 3D coordinate systems.

The min and max of each setting define the minimum and maximum bounds for the spatial data in each
coordinate system.

For example, the following index would only store OSMNodes in the northern half of Central Park:

Create point index with configuration settings

CREATE POINT INDEX central_park_north
FOR (o:OSMNode) ON (o.location)
OPTIONS {
  indexConfig: {
    `spatial.wgs-84.min`: [-73.9743, 40.7714],
    `spatial.wgs-84.max`: [-73.9583, 40.7855]
  }
}

Restricting the geographic area of a point index improves the performance of spatial queries. This is
especially beneficial when dealing with complex, large geo-spatial data, and when spatial queries are a
significant part of an application’s functionality.
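
As a hedged illustration, a bounding-box query restricted to that area could then be served by the
central_park_north index (the coordinates below are illustrative):

MATCH (o:OSMNode)
WHERE point.withinBBox(
  o.location,
  point({srid: 4326, x: -73.9720, y: 40.7790}),
  point({srid: 4326, x: -73.9650, y: 40.7840}))
RETURN count(o) AS nodesInArea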

Composite indexes
It is possible to create a range index on a single property or multiple properties (text and point indexes are
single-property only). The latter are called composite indexes and can be useful if queries against a
database frequently filter on all the properties indexed by the composite index.

The following example first creates a composite index on PointOfInterest nodes for the properties name
and type, and then queries the graph using the shortestPath function to determine both the path length (in
terms of traversed relationships in the graph) and geographical distance between the Zoo School and its
nearest tennis pitch (note that there are 32 unique PointOfInterest tennis pitch nodes in the graph):

Create composite index

CREATE INDEX composite_index FOR (n:PointOfInterest) ON (n.name, n.type)

Query with a filter on both properties indexed by the composite index

PROFILE
MATCH (tennisPitch: PointOfInterest {name: 'pitch', type: 'tennis'})
WITH tennisPitch
MATCH path = shortestPath((tennisPitch)-[:ROUTE*]-(:PointOfInterest {name: 'Zoo School'}))
WITH path, relationships(path) AS relationships
ORDER BY length(path) ASC
LIMIT 1
UNWIND relationships AS rel
RETURN length(path) AS pathLength, sum(rel.distance) AS geographicalDistance

Result

pathLength geographicalDistance

25 2410.4495689536334

Rows: 1

Execution plan

+---------------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+------------------+---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by
| Pipeline |
+---------------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+------------------+---------------------+
| +ProduceResults | 0 | pathLength, geographicalDistance
| 1 | 1 | 0 | 0 | 0/0 | 0.065 |
| |
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
| |
| +OrderedAggregation | 1 | length(path) AS pathLength, sum(rel.distance) AS geographicalDistance
| 1 | 1 | 50 | 5140 | 31/0 | 4.097 | pathLength ASC
| In Pipeline 3 |
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+------------------+---------------------+
| +Unwind | 2 | relationships AS rel
| 1 | 25 | 0 | 3112 | 0/0 | 0.180 |
| In Pipeline 2 |
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
+---------------------+
| +Projection | 3 | relationships(path) AS relationships
| 0 | 1 | 0 | | 0/0 | 0.050 |
| |
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
| |
| +Top | 4 | `length(path)` ASC LIMIT 1
| 0 | 1 | 0 | 57472 | 0/0 | 1.763 | length(path) ASC
| In Pipeline 1 |
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+------------------+---------------------+
| +Projection | 5 | length(path) AS `length(path)`
| 0 | 32 | 0 | | | |
| |
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ |
+------------------+ |
| +ShortestPath | 6 | path = (tennisPitch)-[anon_0:ROUTE*]-(anon_1)
| 0 | 32 | 181451 | 70080 | | |
| |
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ |
+------------------+ |
| +MultiNodeIndexSeek | 7 | RANGE INDEX tennisPitch:PointOfInterest(name, type) WHERE name =
$autostring_0 AND type = $autostrin | 0 | 31 | 0 | 376 |
131215/1 | 188.723 | | Fused in Pipeline 0 |
| | | g_1, RANGE INDEX anon_1:PointOfInterest(name) WHERE name = $autostring_2
| | | | | | |
| |
+---------------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+------------------+---------------------+

Total database accesses: 181501, total allocated memory: 116040

The query plan shows the composite index being used, and not the previously created range index on the
type property. This is because the composite index solves the queried predicates simultaneously, while the
single-property index would only be able to solve part of the predicate.

Property order and query planning

Like single-property range indexes, composite indexes support all predicates:

• Equality check: n.prop = value

• List membership check: n.prop IN [value, …]

• Existence check: n.prop IS NOT NULL

• Range search: n.prop > value

• Prefix search: n.prop STARTS WITH value

However, the order in which properties are defined when creating a composite index impacts how the
planner will use the index to solve predicates. For example, a composite index on (n.prop1, n.prop2,
n.prop3) will generate a different query plan than a composite index created on (n.prop3, n.prop2,
n.prop1).

The following example shows how composite indexes on the same properties defined in a different order
will generate different execution plans:

Create a composite index on three properties

CREATE INDEX composite_2 FOR (n:PointOfInterest) ON (n.lat, n.name, n.type)

Note the order in which the properties are defined when creating the index, with lat first, name second,
and type last.

Query with a filter on the three indexed properties

PROFILE
MATCH (n:PointOfInterest)
WHERE n.lat = 40.7697989 AND n.name STARTS WITH 'William' AND n.type IS NOT NULL
RETURN n.name AS name

Result

name

"William Shakespeare"

Rows: 1

Execution plan

+-----------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+-----------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | 0 | name
| 0 | 0 | 0 | 0 | | |
|
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Projection | 1 | cache[n.name] AS name
| 0 | 0 | 0 | | | |
|
| | +----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexSeek | 2 | RANGE INDEX n:PointOfInterest(lat, name, type) WHERE lat = $autodouble_0 AND name
STARTS WITH $autos | 0 | 0 | 1 | 248 | 0/2 | 1.276
| Fused in Pipeline 0 |
| | | tring_1 AND type IS NOT NULL, cache[n.name]
| | | | | | |
|
+-----------------+----
+------------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 1, total allocated memory: 312

The plan shows the recently created composite index is used. It also shows that the predicates are filtered
as specified in the query (i.e. an equality check on the lat property, a prefix search on the name property,
and an existence check on the type property).

However, if the order of the properties is altered when creating the index, a different query plan will be
generated. To demonstrate this behavior, it is first necessary to drop the recently created composite_2
index and create a new composite index on the same properties defined in a different order:

Drop index

DROP INDEX composite_2

Create a composite index on same three properties defined in a different order

CREATE INDEX composite_3 FOR (n:PointOfInterest) ON (n.name, n.type, n.lat)

Note that the order of the properties has changed: the name property is now the first property defined in
the composite index, and the lat property is indexed last.

Rerun query after the creation of a different composite index

PROFILE
MATCH (n:PointOfInterest)
WHERE n.lat = 40.769798 AND n.name STARTS WITH 'William' AND n.type IS NOT NULL
RETURN n.name AS name

Execution plan

| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
| +ProduceResults | 0 | name | 0 | 0 | 0 | 0 | | | |
| +Projection | 1 | cache[n.name] AS name | 0 | 0 | 0 | | | | |
| +Filter | 2 | cache[n.lat] = $autodouble_0 | 0 | 0 | 0 | | | | |
| +NodeIndexSeek | 3 | RANGE INDEX n:PointOfInterest(name, type, lat) WHERE name STARTS WITH $autostring_1 AND type IS NOT NULL AND lat IS NOT NULL, cache[n.name], cache[n.lat] | 0 | 2 | 3 | 248 | 2/0 | 0.807 | Fused in Pipeline 0 |

Total database accesses: 3, total allocated memory: 312

This plan now shows that, while a prefix search has been used to solve the name property predicate, the
lat property predicate is no longer solved with an equality check, but rather with an existence check and
an explicit filter operation afterward. This is because, when using composite indexes, any predicate after a
prefix search will automatically be planned as an existence check predicate.

Note that if the composite_2 index had not been dropped before the query was rerun, the planner would
have used it instead of the composite_3 index.

Composite index rules

• If a query contains equality check or list membership check predicates, they must be on the first
properties defined when creating the composite index.

• Queries utilizing a composite index can contain up to one range search or prefix search predicate.

• There can be any number of existence check predicates.

• Any predicates following a prefix search or an existence check will be planned as existence checks.

• Suffix and substring search predicates can utilize composite indexes. However, they are always
planned as an existence check and any subsequent query predicates will accordingly also be planned
as such. Note that if these predicates are used, and a text index also exists on any of the indexed
(STRING) properties, the planner will use the text index instead of a composite index.

These rules can be important when creating composite indexes, as some checks are more efficient than
others. For instance, it is generally more efficient for the planner to perform an equality check on a property
than an existence check. Depending on the queries and the application, it may, therefore, be cost-effective
to consider the order in which properties are defined when creating a composite index.

Additionally, it bears repeating that composite indexes can only be used if a predicate filters on all the
properties indexed by the composite index, and that composite indexes can only be created for range
indexes.
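
To illustrate these rules with the composite_3 index created above, the following sketch places a list
membership check on the first indexed property (name) and existence checks on the remaining two (type and
lat), so the planner should be able to solve all three predicates with the composite index. The two name
values are taken from the example data used earlier in this chapter; this is an illustration, not output from
the manual's example run.

Example query following the composite index rules

// Sketch: assumes the composite_3 index on (name, type, lat) still exists.
MATCH (n:PointOfInterest)
WHERE n.name IN ['William Shakespeare', 'William Tecumseh Sherman']
  AND n.type IS NOT NULL
  AND n.lat IS NOT NULL
RETURN n.name AS name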

Range index-backed ORDER BY


Range indexes store properties in ascending order (alphabetically for STRING values, and numerically for
FLOAT and INTEGER values). This can have important implications for query performance, because the
planner may be able to take advantage of a pre-existing index order and therefore not have to perform an
expensive Sort operation later in the query.

To demonstrate this behavior, the following query filters for ROUTE relationships with a distance
property of less than 30, and returns the distance property of the matched relationships in ascending
numerical order using the ORDER BY clause.

Query to return order of results without a relevant index

PROFILE
MATCH ()-[r:ROUTE]-()
WHERE r.distance < 30
RETURN r.distance AS distance
ORDER BY distance

Execution plan

+---------------------------------+----+--------------------------------+----------------+-------
+---------+----------------+------------------------+-----------+--------------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits
| Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
+---------------------------------+----+--------------------------------+----------------+-------
+---------+----------------+------------------------+-----------+--------------+---------------------+
| +ProduceResults | 0 | distance | 3013 | 6744 | 0
| 0 | 0/0 | 14.397 | | |
| | +----+--------------------------------+----------------+-------
+---------+----------------+------------------------+-----------+ | |
| +Sort | 1 | distance ASC | 3013 | 6744 | 0
| 540472 | 0/0 | 16.844 | distance ASC | In Pipeline 1 |
| | +----+--------------------------------+----------------+-------
+---------+----------------+------------------------+-----------+--------------+---------------------+
| +Projection | 2 | cache[r.distance] AS distance | 3013 | 6744 | 0
| | | | | |
| | +----+--------------------------------+----------------+-------
+---------+----------------+ | +--------------+ |
| +Filter | 3 | cache[r.distance] < $autoint_0 | 3013 | 6744 | 10041
| | | | | |
| | +----+--------------------------------+----------------+-------
+---------+----------------+ | +--------------+ |
| +UndirectedRelationshipTypeScan | 4 | (anon_0)-[r:ROUTE]-(anon_1) | 10044 | 10041 | 5023
| 376 | 84/0 | 22.397 | | Fused in Pipeline 0 |
+---------------------------------+----+--------------------------------+----------------+-------
+---------+----------------+------------------------+-----------+--------------+---------------------+

Total database accesses: 15064, total allocated memory: 540808

This plan shows two important points about indexes and the ordering of results:

• Only the relationship type lookup index was used in this query (accessed by the
UndirectedRelationshipTypeScan operator, which fetches all relationships and their start and end
nodes from the relationship type index).

• As a result, the planner has to perform a Sort operation to order the results by the distance property
(in this case, it required 540472 bytes of memory).

To see how an index could impact the query plan, it is first necessary to create a range index on the
distance property:

Create a range index on a relationship type property

CREATE INDEX range_index_relationships FOR ()-[r:ROUTE]-() ON (r.distance)

Re-running the query, it now generates a different plan:

Rerun query after the creation of a relevant index

PROFILE
MATCH ()-[r:ROUTE]-()
WHERE r.distance < 30
RETURN r.distance AS distance
ORDER BY distance

Execution plan

| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
| +ProduceResults | 0 | distance | 301 | 6744 | 0 | 0 | | | | |
| +Projection | 1 | cache[r.distance] AS distance | 301 | 6744 | 0 | | | | distance ASC | |
| +UndirectedRelationshipIndexSeekByRange | 2 | RANGE INDEX (anon_0)-[r:ROUTE(distance)]-(anon_1) WHERE distance < $autoint_0, cache[r.distance] | 301 | 6744 | 3373 | 248 | 2361/10 | 76.542 | r.distance ASC | Fused in Pipeline 0 |

Total database accesses: 3373, total allocated memory: 312

Focusing on the same two points in the plan, the following has changed:

• The recently created range index on the relationship type property distance is now used.

• As a result, the plan no longer needs to perform a Sort operation to order the results (because the
distance property is already ordered by the index), and this substantially reduces the cost of the query
(the total memory cost of the query is now 312 bytes).

In the same way, the order of a range index can significantly improve queries using the max() and min()
functions.
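
For example, the following sketch (assuming the range_index_relationships index created above is still in
place) asks for the shortest ROUTE distance; because the index already stores distance values in ascending
order, the planner can typically answer this without aggregating over every relationship:

Query that can leverage index order for min()

// Sketch: relies on the range index on ROUTE(distance) created above.
PROFILE
MATCH ()-[r:ROUTE]-()
RETURN min(r.distance) AS shortestDistance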

Multiple index use


Indexes are principally used to find the starting points of patterns. If a query contains one MATCH clause,
then, as a general rule, only the index that best suits the predicates in that clause will be selected by the
planner. If, however, a query contains two or more MATCH clauses, it is possible to use several indexes.

To show multiple indexes used in one query, the following example will first create a new index on the lon
(longitude) property for PointOfInterest nodes. It then uses a query that finds all PointOfInterest nodes
north of the William Shakespeare statue in Central Park.

Create a range index on the longitude property

CREATE INDEX range_index_lon FOR (n:PointOfInterest) ON (n.lon)

Query to find all PointOfInterest nodes north of the William Shakespeare statue

PROFILE
MATCH (ws:PointOfInterest {name:'William Shakespeare'})
WITH ws
MATCH (poi:PointOfInterest)
WHERE poi.lon > ws.lon
RETURN poi.name AS name

Execution plan

| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
| +ProduceResults | 0 | name | 9 | 143 | 0 | 0 | | | |
| +Projection | 1 | poi.name AS name | 9 | 143 | 283 | | | | |
| +Apply | 2 | | 9 | 143 | 0 | | | | |
| | +NodeIndexSeekByRange | 3 | RANGE INDEX poi:PointOfInterest(lon) WHERE lon > ws.lon | 9 | 143 | 146 | 2280 | 233/1 | 1.460 | Fused in Pipeline 1 |
| +NodeIndexSeek | 4 | RANGE INDEX ws:PointOfInterest(name) WHERE name = $autostring_0 | 2 | 1 | 2 | 376 | 1/0 | 0.635 | In Pipeline 0 |

Total database accesses: 431, total allocated memory: 2616

This plan shows that a separate index is used to improve the performance of each MATCH clause (first by
utilizing the index on the name property to find the William Shakespeare node, and then by using the index
on the lon property to find all nodes with a greater longitudinal value).

Indexes and null values


Neo4j indexes do not store null values. This means that the planner must be able to rule out the possibility
of null values in order for queries to use an index.

The following query demonstrates the incompatibility between null values and indexes by counting all
PointOfInterest nodes with an unset name property:

Query to count nodes with a null name value

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NULL
RETURN count(n) AS nodes

Result

nodes

3

Rows: 1

Execution plan

+-------------------+----+-------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page
Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+-------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 |
0/0 | 0.012 | In Pipeline 1 |
| | +----+-------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 |
| | |
| | +----+-------------------+----------------+------+---------+----------------+
| | |
| +Filter | 2 | n.name IS NULL | 141 | 3 | 373 | |
| | |
| | +----+-------------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | 3 | n:PointOfInterest | 188 | 188 | 189 | 376 |
115/0 | 0.769 | Fused in Pipeline 0 |
+-------------------+----+-------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 562, total allocated memory: 472

The plan shows that neither of the two available indexes (range and text) on the name property is used to
solve the predicate.

However, if a query predicate is added which is able to exclude the presence of any null values, then an
index can be used. The following query shows this by adding a substring predicate to the above query:

Query to count nodes with a null name value or nodes with a name property containing 'William'

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NULL OR n.name CONTAINS 'William'
RETURN count(n) AS nodes

Result

nodes

5

Rows: 1

The query result now includes both the three nodes with an unset name value found in the previous query
and the two nodes with a name value containing William (William Shakespeare and William Tecumseh
Sherman).

Execution plan

| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
| +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 | 0/0 | 0.010 | In Pipeline 3 |
| +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 | | | |
| +Distinct | 2 | n | 141 | 5 | 0 | 352 | | | |
| +Union | 3 | | 142 | 5 | 0 | 352 | 0/0 | 0.220 | Fused in Pipeline 2 |
| | +NodeIndexContainsScan | 4 | TEXT INDEX n:PointOfInterest(name) WHERE name CONTAINS $autostring_0 | 1 | 2 | 3 | 376 | 4/0 | 0.456 | In Pipeline 1 |
| +Filter | 5 | n.name IS NULL | 141 | 3 | 373 | | | | |
| +NodeByLabelScan | 6 | n:PointOfInterest | 188 | 188 | 189 | 376 | 115/0 | 0.673 | Fused in Pipeline 0 |

Total database accesses: 565, total allocated memory: 1352

This plan shows that an index is only used to solve the second part of the WHERE clause, which excludes the
presence of null values.

The presence of null values within an indexed property therefore does not negate the use of an index.
Index use is only negated if the planner is unable to rule out the inclusion of any unset properties in the
matching process.

The presence of null values may not be known in advance, and this can cause unexpected instances of
indexes not being used. There are, however, a few strategies to ensure that an index will be used.

Property existence checks

One method to ensure an index is used is to explicitly filter out any null values by appending IS NOT NULL
to the queried property. The following example uses the same query as above but exchanges IS NULL with
IS NOT NULL in the WHERE clause:

Query to count PointOfInterest nodes without a null name value

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NOT NULL
RETURN count(n) AS nodes

Result

nodes

185

Rows: 1

Execution plan

+-------------------+----+------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | nodes | 1 |
1 | 0 | 0 | 0/0 | 0.013 | In Pipeline 1 |
| | +----+------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +EagerAggregation | 1 | count(n) AS nodes | 1 |
1 | 0 | 32 | | | |
| | +----+------------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeIndexScan | 2 | RANGE INDEX n:PointOfInterest(name) WHERE name IS NOT NULL | 185 |
185 | 186 | 376 | 0/1 | 0.691 | Fused in Pipeline 0 |
+-------------------+----+------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 186, total allocated memory: 472

This plan shows that the previously created range index on the name property is now used to solve the
predicate.

Text indexes and type predicate expressions

Text indexes require that predicates only include STRING properties.

To use text indexes in situations where any of the queried properties may be either of an incompatible type
or null rather than a STRING value, add the type predicate expression IS :: STRING NOT NULL (or its alias,
introduced in Neo4j 5.14, IS :: STRING!) to the query. This will enforce both the existence of a property
and its STRING type, discarding any rows where the property is missing or not of type STRING, and thereby
enable the use of text indexes.

For example, if the WHERE predicate in the previous query is altered to instead append IS :: STRING NOT
NULL, then the text index rather than the range index is used (range indexes do not support type predicate
expressions):

Query using a type predicate expression

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS :: STRING NOT NULL
RETURN count(n) AS nodes

Execution plan

+-------------------+----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | nodes | 1 |
1 | 0 | 0 | 0/0 | 0.009 | In Pipeline 1 |
| | +----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +EagerAggregation | 1 | count(n) AS nodes | 1 |
1 | 0 | 32 | | | |
| | +----+-----------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeIndexScan | 2 | TEXT INDEX n:PointOfInterest(name) WHERE name IS NOT NULL | 185 |
185 | 186 | 376 | 0/0 | 0.343 | Fused in Pipeline 0 |
+-------------------+----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 186, total allocated memory: 472

While type predicate expressions were introduced in Neo4j 5.9, the IS :: STRING NOT NULL syntax only
became an index-compatible predicate in Neo4j 5.15. For more information, see the page about type
predicate expressions.

The toString function can also be used to convert an expression to STRING values, and thereby help the
planner to select a text index.
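
For example, in the following sketch the parameter $searchTerm is hypothetical and its type is not known in
advance; wrapping it in toString() guarantees that the comparison operates on a STRING value, which can
help the planner consider the text index on the name property:

Query using toString to help the planner select a text index

// Sketch: $searchTerm is a hypothetical parameter of unknown type.
MATCH (n:PointOfInterest)
WHERE n.name CONTAINS toString($searchTerm)
RETURN n.name AS name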

Property type constraints (new in Neo4j 5.11)

For indexes that are compatible only with specific types (i.e. text and point indexes), the Cypher planner
needs to deduce that a predicate will evaluate to null for non-compatible values in order to use the index.
If a predicate is not explicitly defined as the required type (STRING or POINT), this can lead to situations
where a text or point index is not used.

Since property type constraints guarantee that a property is always of the same type, they can be used to
extend the scenarios in which text and point indexes are compatible with a predicate.

To show this, the following example will first drop the existing range index on the name property (this is
necessary because property type constraints only extend the compatibility of type-specific indexes - range
indexes are not limited by a value type). It will then run the same query with a WHERE predicate on the name
property (for which there exists a previously created text index) before and after creating a property type
constraint, and compare the resulting execution plans.

Drop range index

DROP INDEX range_index_name

Query to count PointOfInterest nodes without a null name value

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NOT NULL
RETURN count(n) AS nodes

Execution plan

+-------------------+----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page
Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 |
0/0 | 0.012 | In Pipeline 1 |
| | +----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 |
| | |
| | +----+--------------------+----------------+------+---------+----------------+
| | |
| +Filter | 2 | n.name IS NOT NULL | 187 | 185 | 373 | |
| | |
| | +----+--------------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | 3 | n:PointOfInterest | 188 | 188 | 189 | 376 |
259/0 | 0.363 | Fused in Pipeline 0 |
+-------------------+----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 562, total allocated memory: 472

This plan shows that the available text index on the name property was not used to solve the predicate.
This is because the planner was not able to deduce that all name values are of type STRING.

However, if a property type constraint is created to ensure that all name properties have a STRING value, a
different query plan is generated.

Create STRING type constraint on the name property

CREATE CONSTRAINT type_constraint
FOR (n:PointOfInterest) REQUIRE n.name IS :: STRING

Rerun the query after the creation of a property type constraint

PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NOT NULL
RETURN count(n) AS nodes

Execution plan

+-------------------+----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | nodes | 1 |
1 | 0 | 0 | 0/0 | 0.013 | In Pipeline 1 |
| | +----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +EagerAggregation | 1 | count(n) AS nodes | 1 |
1 | 0 | 32 | | | |
| | +----+-----------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeIndexScan | 2 | TEXT INDEX n:PointOfInterest(name) WHERE name IS NOT NULL | 187 |
185 | 186 | 376 | 0/0 | 0.328 | Fused in Pipeline 0 |
+-------------------+----+-----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 186, total allocated memory: 472

Because of the property type constraint on the name property, the planner is now able to deduce that all
name properties are of type STRING, and therefore use the available text index.

Point indexes can be extended in the same way if a property type constraint is created to ensure that all
properties are POINT values.
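
For example (a sketch only; the location property and the constraint name are illustrative and not part of
the example dataset), a POINT type constraint could be created as follows:

Create a POINT type constraint on a location property

// Sketch: assumes a location property that should always hold POINT values.
CREATE CONSTRAINT location_type_constraint
FOR (n:PointOfInterest) REQUIRE n.location IS :: POINT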

Note that property existence constraints do not currently leverage index use in the same way.

Heuristics: deciding what to index


While it is impossible to give exact directions on when a search-performance index might be beneficial for
a particular use-case, the following points provide some useful heuristics for when creating an index might
improve query performance:

• Frequent property-based queries: if some properties are used frequently for filtering or matching,
consider creating an index on them.

• Performance optimization: If certain queries are too slow, re-examine the properties that are filtered
on, and consider creating indexes for those properties that may cause bottlenecking.

• High cardinality properties: high cardinality properties have many distinct values (e.g., unique
identifiers, timestamps, or user names). Queries that seek to retrieve such properties will likely benefit
from indexing.

• Complex queries: if queries traverse complex paths in a graph (for example, by involving multiple hops
and several layers of filtering), adding indexes to the properties used in those queries can improve
query performance.

• Experiment and test: It is good practice to experiment with different indexes and query patterns, and
to measure the performance of critical queries with and without different indexes to evaluate their
effectiveness.

Over-indexing: considerations and solutions


Search-performance indexes can significantly improve query performance. They should, however, be used
judiciously for the following reasons:

• Storage space: because each index is a secondary copy of the data in the primary database, each
index essentially doubles the amount of storage space occupied by the indexed data.

• Slower write queries: adding indexes impacts the performance of write queries. This is because
indexes are updated with each write query. If a system needs to perform a lot of writes quickly, it may
be counterproductive to have an index on the affected data entities. In other words, if write
performance is crucial for a particular use case, it may be beneficial to only add indexes where they are
necessary for read purposes.

As a result of these two points, deciding what to index (and what not to index) is an important and non-
trivial task.

Keeping track of index-use: lastRead, readCount, and trackedSince

Unused indexes take up unnecessary storage space and it may be beneficial to remove them. Knowing
which indexes are most frequently used by the queries against a database can, however, be difficult. As of
Neo4j 5.8, there are three relevant columns returned by the SHOW INDEX command which can help identify
redundant indexes:

• lastRead: returns the last time the index was used for reading.

• readCount: returns the number of read queries issued to the index.


• trackedSince: returns the time when usage statistics tracking started for an index.

To return these values (along with other relevant information) for the indexes in a database, run the
following query:

Query to identify redundant indexes

SHOW INDEX YIELD name, type, entityType, labelsOrTypes, properties, lastRead, readCount, trackedSince

If any unused indexes are identified, it may be beneficial to delete them using the DROP INDEX command.
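
For example, the following sketch narrows the output down to indexes that have not served any reads (the
readCount filter is an assumption; pick a threshold that fits the workload, and note that the statistics only
cover the period since trackedSince). Any indexes it returns are candidates for removal with DROP INDEX:

Query to list indexes that have not been read since tracking started

// Sketch: treats a missing or zero readCount as "unused".
SHOW INDEXES
YIELD name, type, labelsOrTypes, properties, lastRead, readCount, trackedSince
WHERE readCount = 0 OR readCount IS NULL
RETURN name, type, labelsOrTypes, properties, lastRead, readCount, trackedSince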

Summary
• Range indexes can be used to solve most predicates.

• Text indexes are used over range indexes for CONTAINS and ENDS WITH predicates on STRING properties,
and if the queried STRING properties exceed 8 kb.

• Point indexes are used when queries filter on distances and bounding boxes.

• Token lookup indexes only solve node label and relationship type predicates. They do not solve any
property predicates. Deleting token lookup indexes will negatively impact query performance.

• Composite indexes are only used if the query filters on all properties indexed by the composite index.
The order in which the properties are defined when creating a composite index impacts how the
planner solves query predicates.

• Queries ordering results using ORDER BY can leverage the pre-existing order in range indexes and
thereby improve query performance.

• A Cypher query can use several indexes if the planner deems it beneficial to the performance of a
query.

• Neo4j indexes do not store null values, and the planner must be able to rule out any entities with
properties containing null values in order to use an index. There are several strategies to ensure the
use of indexes.

• The columns lastRead, readCount, and trackedSince returned by the SHOW INDEX command can be
used to identify redundant indexes that take up unnecessary space.

Index hints for the Cypher planner


A planner hint is used to influence the decisions of the planner when building an execution plan for a
query. Planner hints are specified in a query with the USING keyword.

Forcing planner behavior is an advanced feature, and should be used with caution by experienced
developers and/or database administrators only, as it may cause queries to perform poorly.

When executing a query, Neo4j needs to decide where in the query graph to start matching. This is done
by looking at the MATCH clause and the WHERE conditions and using that information to find useful indexes,
or other starting points.

However, the selected index might not always be the best choice. Sometimes multiple indexes are possible
candidates, and the query planner picks the suboptimal one from a performance point of view. Moreover,
in some circumstances (albeit rarely) it is better not to use an index at all.

Neo4j can be forced to use a specific starting point through the USING keyword. This is called giving a
planner hint.

There are three types of planner hints:

• Index hints.

• Scan hints.

• Join hints.

Query

PROFILE
MATCH
(s:Scientist {born: 1850})-[:RESEARCHED]->
(sc:Science)<-[i:INVENTED_BY {year: 560}]-
(p:Pioneer {born: 525})-[:LIVES_IN]->
(c:City)-[:PART_OF]->
(cc:Country {formed: 411})
RETURN *

The query above will be used in some of the examples on this page. Without any hints, one index and no
join is used.

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | c, cc, i, p, s, sc | 0 | 0 |
0 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | s.born = $autoint_0 AND s:Scientist | 0 | 0 |
0 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (sc)<-[anon_0:RESEARCHED]-(s) | 0 | 0 |
0 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | i.year = $autoint_1 AND sc:Science | 0 | 0 |
0 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (p)-[i:INVENTED_BY]->(sc) | 0 | 0 |
0 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | p.born = $autoint_2 AND p:Pioneer | 0 | 0 |
2 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (c)<-[anon_1:LIVES_IN]-(p) | 1 | 1 |
3 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | c:City | 1 | 1 |
2 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (cc)<-[anon_2:PART_OF]-(c) | 1 | 1 |
2 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeIndexSeek | RANGE INDEX cc:Country(formed) WHERE formed = $autoint_3 | 1 | 1 |
2 | 120 | 6/1 | 0.506 | Fused in Pipeline 0 |
+-----------------+----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 11, total allocated memory: 208

Index hints
Index hints are used to specify which index the planner should use as a starting point. This can be
beneficial in cases where the index statistics are not accurate for the specific values that the query at hand
is known to use, which would result in the planner picking a non-optimal index. An index hint is supplied
after an applicable MATCH clause.

Available index hints are:

• USING [RANGE | TEXT | POINT] INDEX variable:Label(property)
  Fulfilled by: NodeIndexScan, NodeIndexSeek

• USING [RANGE | TEXT | POINT] INDEX SEEK variable:Label(property)
  Fulfilled by: NodeIndexSeek

• USING [RANGE | TEXT | POINT] INDEX variable:RELATIONSHIP_TYPE(property)
  Fulfilled by: DirectedRelationshipIndexScan, UndirectedRelationshipIndexScan, DirectedRelationshipIndexSeek, UndirectedRelationshipIndexSeek

• USING [RANGE | TEXT | POINT] INDEX SEEK variable:RELATIONSHIP_TYPE(property)
  Fulfilled by: DirectedRelationshipIndexSeek, UndirectedRelationshipIndexSeek

When specifying an index type for a hint, e.g. RANGE, TEXT, or POINT, the hint can only be fulfilled when an
index of the specified type is available. When no index type is specified, the hint can be fulfilled by any
index type.

Using a hint must never change the result of a query. Therefore, a hint with a specified index type is only
fulfillable when the planner knows that using an index of the specified type does not change the results.
Please refer to The use of indexes for more details.

It is possible to supply several index hints, but keep in mind that several starting points will require the use
of a potentially expensive join later in the query plan.

Query using a node index hint

The query above can be tuned to pick a different index as the starting point.

Query

PROFILE
MATCH
(s:Scientist {born: 1850})-[:RESEARCHED]->
(sc:Science)<-[i:INVENTED_BY {year: 560}]-
(p:Pioneer {born: 525})-[:LIVES_IN]->
(c:City)-[:PART_OF]->
(cc:Country {formed: 411})
USING INDEX p:Pioneer(born)
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-----------------------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits
| Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+-----------------------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | c, cc, i, p, s, sc | 0 | 0 | 0
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Filter | cc.formed = $autoint_3 AND cc:Country | 0 | 0 | 0
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Expand(All) | (c)-[anon_2:PART_OF]->(cc) | 0 | 0 | 0
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Filter | c:City | 0 | 0 | 0
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Expand(All) | (p)-[anon_1:LIVES_IN]->(c) | 0 | 0 | 0
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Filter | s.born = $autoint_0 AND s:Scientist | 0 | 0 | 0
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Expand(All) | (sc)<-[anon_0:RESEARCHED]-(s) | 0 | 0 | 0
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Filter | i.year = $autoint_1 AND sc:Science | 0 | 0 | 2
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +Expand(All) | (p)-[i:INVENTED_BY]->(sc) | 2 | 2 | 6
| | | | |
| | +-----------------------------------------------------+----------------+------+---------
+----------------+ | | |
| +NodeIndexSeek | RANGE INDEX p:Pioneer(born) WHERE born = $autoint_2 | 2 | 2 | 3
| 120 | 4/1 | 0.491 | Fused in Pipeline 0 |
+-----------------+-----------------------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 11, total allocated memory: 208

Query using a node text index hint

The following query can be tuned to pick a text index.

Query

PROFILE
MATCH (c:Country)
USING TEXT INDEX c:Country(name)
WHERE c.name = 'Country7'
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | c | 1 | 1 |
0 | | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeIndexSeek | TEXT INDEX c:Country(name) WHERE name = $autostring_0 | 1 | 1 |
2 | 120 | 2/0 | 0.949 | Fused in Pipeline 0 |
+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 2, total allocated memory: 184

Query using a relationship index hint

The query above can be tuned to pick a relationship index as the starting point.

Query

PROFILE
MATCH
(s:Scientist {born: 1850})-[:RESEARCHED]->
(sc:Science)<-[i:INVENTED_BY {year: 560}]-
(p:Pioneer {born: 525})-[:LIVES_IN]->
(c:City)-[:PART_OF]->
(cc:Country {formed: 411})
USING INDEX i:INVENTED_BY(year)
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
| +ProduceResults | c, cc, i, p, s, sc | 0 | 0 | 0 | | | | |
| +Filter | cc.formed = $autoint_3 AND cc:Country | 0 | 0 | 0 | | | | |
| +Expand(All) | (c)-[anon_2:PART_OF]->(cc) | 0 | 0 | 0 | | | | |
| +Filter | c:City | 0 | 0 | 0 | | | | |
| +Expand(All) | (p)-[anon_1:LIVES_IN]->(c) | 0 | 0 | 0 | | | | |
| +Filter | s.born = $autoint_0 AND s:Scientist | 0 | 0 | 0 | | | | |
| +Expand(All) | (sc)<-[anon_0:RESEARCHED]-(s) | 0 | 0 | 0 | | | | |
| +Filter | p.born = $autoint_2 AND sc:Science AND p:Pioneer | 0 | 0 | 4 | | | | |
| +DirectedRelationshipIndexSeek | RANGE INDEX (p)-[i:INVENTED_BY(year)]->(sc) WHERE year = $autoint_1 | 2 | 2 | 3 | 120 | 5/1 | 0.461 | Fused in Pipeline 0 |

Total database accesses: 7, total allocated memory: 208

Query using a relationship text index hint

The following query can be tuned to pick a text index.

Query

PROFILE
MATCH ()-[i:INVENTED_BY]->()
USING TEXT INDEX i:INVENTED_BY(location)
WHERE i.location = 'Location7'
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
| +ProduceResults | i | 1 | 1 | 0 | | | | |
| +DirectedRelationshipIndexSeek | TEXT INDEX (anon_0)-[i:INVENTED_BY(location)]->(anon_1) WHERE location = $autostring_0 | 1 | 1 | 2 | 120 | 3/0 | 1.079 | Fused in Pipeline 0 |

Total database accesses: 2, total allocated memory: 184

Query using multiple index hints

Supplying one index hint changed the starting point of the query, but the plan is still linear, meaning it only
has one starting point. If we give the planner yet another index hint, we force it to use two starting points,
one at each end of the match. It will then join these two branches using a join operator.

Query

PROFILE
MATCH
(s:Scientist {born: 1850})-[:RESEARCHED]->
(sc:Science)<-[i:INVENTED_BY {year: 560}]-
(p:Pioneer {born: 525})-[:LIVES_IN]->
(c:City)-[:PART_OF]->
(cc:Country {formed: 411})
USING INDEX s:Scientist(born)
USING INDEX cc:Country(formed)
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | c, cc, i, p, s, sc | 0 | 0 |
0 | | 0/0 | 0.000 | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+ |
| +NodeHashJoin | sc | 0 | 0 |
0 | 432 | | | In Pipeline 2 |
| |\ +----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| | +Expand(All) | (s)-[anon_0:RESEARCHED]->(sc) | 1 | 0 |
0 | | | | |
| | | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| | +NodeIndexSeek | RANGE INDEX s:Scientist(born) WHERE born = $autoint_0 | 1 | 0 |
0 | 120 | 0/0 | 0.000 | Fused in Pipeline 1 |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +Filter | i.year = $autoint_1 AND sc:Science | 0 | 0 |
0 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (p)-[i:INVENTED_BY]->(sc) | 0 | 0 |
0 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | p.born = $autoint_2 AND p:Pioneer | 0 | 0 |
2 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (c)<-[anon_1:LIVES_IN]-(p) | 1 | 1 |
3 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | c:City | 1 | 1 |
2 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (cc)<-[anon_2:PART_OF]-(c) | 1 | 1 |
2 | | | | |
| | +----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeIndexSeek | RANGE INDEX cc:Country(formed) WHERE formed = $autoint_3 | 1 | 1 |
2 | 120 | 7/0 | 0.494 | Fused in Pipeline 0 |
+------------------+----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 11, total allocated memory: 768

Query using multiple index hints with a disjunction

Supplying multiple index hints can also be useful if the query contains a disjunction (OR) in the WHERE
clause. This makes sure that all hinted indexes are used and the results are joined together with a Union
and a Distinct afterwards.

Query

PROFILE
MATCH (country:Country)
USING INDEX country:Country(name)
USING INDEX country:Country(formed)
WHERE country.formed = 500 OR country.name STARTS WITH "A"
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
| +ProduceResults | country | 1 | 1 | 0 | | | | |
| +Distinct | country | 1 | 1 | 0 | 224 | | | |
| +Union | | 2 | 1 | 0 | 80 | 1/0 | 0.213 | Fused in Pipeline 2 |
| | +NodeIndexSeek | RANGE INDEX country:Country(formed) WHERE formed = $autoint_0 | 1 | 1 | 2 | 120 | 1/0 | 0.101 | In Pipeline 1 |
| +NodeIndexSeekByRange | RANGE INDEX country:Country(name) WHERE name STARTS WITH $autostring_1 | 1 | 0 | 1 | 120 | 0/1 | 0.307 | In Pipeline 0 |

Total database accesses: 3, total allocated memory: 320

Cypher will usually provide a plan that uses all indexes for a disjunction without hints. It may, however,
decide to plan a NodeByLabelScan instead, if the predicates appear to be not very selective. In this case, the
index hints can be useful.

Scan hints
If your query matches large parts of an index, it might be faster to scan the label or relationship type and
filter out rows that do not match. To do this, you can use USING SCAN variable:Label after the applicable
MATCH clause for node indexes, and USING SCAN variable:RELATIONSHIP_TYPE for relationship indexes. This
will force Cypher to not use an index that could have been used, and instead do a label scan/relationship
type scan. You can use the same hint to enforce a starting point where no index is applicable.

Hinting a label scan

Query

PROFILE
MATCH
(s:Scientist {born: 1850})-[:RESEARCHED]->
(sc:Science)<-[i:INVENTED_BY {year: 560}]-
(p:Pioneer {born: 525})-[:LIVES_IN]->
(c:City)-[:PART_OF]->
(cc:Country {formed: 411})
USING SCAN s:Scientist
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+-----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+-----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | c, cc, i, p, s, sc | 0 | 0 |
0 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | cc.formed = $autoint_3 AND cc:Country | 0 | 0 |
0 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (c)-[anon_2:PART_OF]->(cc) | 0 | 0 |
0 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | c:City | 0 | 0 |
0 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (p)-[anon_1:LIVES_IN]->(c) | 0 | 0 |
0 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | i.year = $autoint_1 AND p.born = $autoint_2 AND p:Pioneer | 0 | 0 |
1 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (sc)<-[i:INVENTED_BY]-(p) | 1 | 1 |
3 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | sc:Science | 1 | 1 |
2 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (s)-[anon_0:RESEARCHED]->(sc) | 1 | 1 |
2 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | s.born = $autoint_0 | 1 | 1 |
200 | | | | |
| | +-----------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeByLabelScan | s:Scientist | 100 | 100 |
101 | 120 | 11/0 | 0.512 | Fused in Pipeline 0 |
+------------------+-----------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 309, total allocated memory: 216

Hinting a relationship type scan

Query

PROFILE
MATCH
(s:Scientist {born: 1850})-[:RESEARCHED]->
(sc:Science)<-[i:INVENTED_BY {year: 560}]-
(p:Pioneer {born: 525})-[:LIVES_IN]->
(c:City)-[:PART_OF]->
(cc:Country {formed: 411})
USING SCAN i:INVENTED_BY
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
| +ProduceResults | c, cc, i, p, s, sc | 0 | 0 | 0 | | | | |
| +Filter | cc.formed = $autoint_3 AND cc:Country | 0 | 0 | 0 | | | | |
| +Expand(All) | (c)-[anon_2:PART_OF]->(cc) | 0 | 0 | 0 | | | | |
| +Filter | c:City | 0 | 0 | 0 | | | | |
| +Expand(All) | (p)-[anon_1:LIVES_IN]->(c) | 0 | 0 | 0 | | | | |
| +Filter | s.born = $autoint_0 AND s:Scientist | 0 | 0 | 0 | | | | |
| +Expand(All) | (sc)<-[anon_0:RESEARCHED]-(s) | 0 | 0 | 0 | | | | |
| +Filter | i.year = $autoint_1 AND p.born = $autoint_2 AND sc:Science AND p:Pioneer | 0 | 0 | 204 | | | | |
| +DirectedRelationshipTypeScan | (p)-[i:INVENTED_BY]->(sc) | 100 | 100 | 101 | 120 | 9/0 | 0.910 | Fused in Pipeline 0 |

Total database accesses: 305, total allocated memory: 208

Query using multiple scan hints with a disjunction

Supplying multiple scan hints can also be useful if the query contains a disjunction (OR) in the WHERE clause.
This makes sure that all involved label predicates are solved by a UnionNodeByLabelsScan.

Query

PROFILE
MATCH (person)
USING SCAN person:Pioneer
USING SCAN person:Scientist
WHERE person:Pioneer OR person:Scientist
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------+--------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator               | Details                  | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+------------------------+--------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults        | person                   |            180 |  200 |       0 |                |                        |           |                     |
| +UnionNodeByLabelsScan | person:Pioneer|Scientist |            180 |  200 |     202 |            120 | 6/0                    |     1.740 | Fused in Pipeline 0 |
+------------------------+--------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 202, total allocated memory: 184

Cypher will usually provide a plan that uses scans for a disjunction without hints. It may, however, decide
to plan an AllNodesScan followed by a Filter instead, if the label predicates do not appear to be very
selective. In this case, the scan hints can be useful.
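
For reference, this is the same disjunction without any hints. Depending on the database statistics, the
planner may choose the same UnionNodeByLabelsScan on its own, or fall back to an AllNodesScan followed by a
Filter.

Query

PROFILE
MATCH (person)
WHERE person:Pioneer OR person:Scientist
RETURN *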

Join hints
Join hints are the most advanced type of hints, and are not used to find starting points for the query
execution plan, but to enforce that joins are made at specified points. This implies that there has to be
more than one starting point (leaf) in the plan, in order for the query to be able to join the two branches
ascending from these leaves. Due to this nature, joins, and subsequently join hints, will force the planner to
look for additional starting points, and in the case where there are no more good ones, potentially pick a
very bad starting point. This will negatively affect query performance. In other cases, the hint might force
the planner to pick a seemingly bad starting point, which in reality proves to be a very good one.

Hinting a join on a single node

In the example above using multiple index hints, we saw that the planner chose to do a join, but not on the
p node. By supplying a join hint in addition to the index hints, we can enforce the join to happen on the p
node.

Query

PROFILE
MATCH
(s:Scientist {born: 1850})-[:RESEARCHED]->
(sc:Science)<-[i:INVENTED_BY {year: 560}]-
(p:Pioneer {born: 525})-[:LIVES_IN]->
(c:City)-[:PART_OF]->
(cc:Country {formed: 411})
USING INDEX s:Scientist(born)
USING INDEX cc:Country(formed)
USING JOIN ON p
RETURN *

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------+-------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator           | Details                                                           | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+--------------------+-------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults    | c, cc, i, p, s, sc                                                |              0 |    0 |       0 |                | 0/0                    |     0.000 |                     |
| +NodeHashJoin      | p                                                                 |              0 |    0 |       0 |            432 |                        |           | In Pipeline 2       |
| |\                 |                                                                   |                |      |         |                |                        |           |                     |
| | +Filter          | cache[p.born] = $autoint_2                                        |              1 |    0 |       0 |                |                        |           |                     |
| | +Expand(All)     | (c)<-[anon_1:LIVES_IN]-(p)                                        |              1 |    0 |       0 |                |                        |           |                     |
| | +Filter          | c:City                                                            |              1 |    0 |       0 |                |                        |           |                     |
| | +Expand(All)     | (cc)<-[anon_2:PART_OF]-(c)                                        |              1 |    0 |       0 |                |                        |           |                     |
| | +NodeIndexSeek   | RANGE INDEX cc:Country(formed) WHERE formed = $autoint_3          |              1 |    0 |       0 |            120 | 0/0                    |     0.000 | Fused in Pipeline 1 |
| +Filter            | i.year = $autoint_1 AND cache[p.born] = $autoint_2 AND p:Pioneer  |              0 |    0 |       1 |                |                        |           |                     |
| +Expand(All)       | (sc)<-[i:INVENTED_BY]-(p)                                         |              1 |    1 |       3 |                |                        |           |                     |
| +Filter            | sc:Science                                                        |              1 |    1 |       2 |                |                        |           |                     |
| +Expand(All)       | (s)-[anon_0:RESEARCHED]->(sc)                                     |              1 |    1 |       2 |                |                        |           |                     |
| +NodeIndexSeek     | RANGE INDEX s:Scientist(born) WHERE born = $autoint_0             |              1 |    1 |       2 |            120 | 6/1                    |     0.515 | Fused in Pipeline 0 |
+--------------------+-------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 10, total allocated memory: 768

Hinting a join for an OPTIONAL MATCH

A join hint can also be used to force the planner to pick a NodeLeftOuterHashJoin or
NodeRightOuterHashJoin to solve an OPTIONAL MATCH. In most cases, the planner will rather use an
OptionalExpand.

Query

PROFILE
MATCH (s:Scientist {born: 1850})
OPTIONAL MATCH (s)-[:RESEARCHED]->(sc:Science)
RETURN *

Without any hint, the planner did not use a join to solve the OPTIONAL MATCH.

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------+--------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator             | Details                                                | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+----------------------+--------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults      | s, sc                                                  |              1 |    1 |       0 |                |                        |           |                     |
| +OptionalExpand(All) | (s)-[anon_0:RESEARCHED]->(sc) WHERE sc:Science         |              1 |    1 |       4 |                |                        |           |                     |
| +NodeIndexSeek       | RANGE INDEX s:Scientist(born) WHERE born = $autoint_0  |              1 |    1 |       2 |            120 | 6/0                    |     0.560 | Fused in Pipeline 0 |
+----------------------+--------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 6, total allocated memory: 184

Query

PROFILE
MATCH (s:Scientist {born: 1850})
OPTIONAL MATCH (s)-[:RESEARCHED]->(sc:Science)
USING JOIN ON s
RETURN *

Now the planner uses a join to solve the OPTIONAL MATCH.

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------------+--------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator                | Details                                                | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+-------------------------+--------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults         | s, sc                                                  |              1 |    1 |       0 |                | 2/0                    |     0.213 |                     |
| +NodeLeftOuterHashJoin  | s                                                      |              1 |    1 |       0 |           3112 |                        |     0.650 | In Pipeline 2       |
| |\                      |                                                        |                |      |         |                |                        |           |                     |
| | +Expand(All)          | (sc)<-[anon_0:RESEARCHED]-(s)                          |            100 |  100 |     300 |                |                        |           |                     |
| | +NodeByLabelScan      | sc:Science                                             |            100 |  100 |     101 |            120 | 4/0                    |     0.786 | Fused in Pipeline 1 |
| +NodeIndexSeek          | RANGE INDEX s:Scientist(born) WHERE born = $autoint_0  |              1 |    1 |       2 |            120 | 1/0                    |     0.214 | In Pipeline 0       |
+-------------------------+--------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 403, total allocated memory: 3192

Semantic indexes
Unlike search-performance indexes, semantic indexes capture the semantic meaning or context of the data
in a database. This is done by returning an approximation score, which indicates the similarity between a
query string and the data in a database.

Two semantic indexes are available in Neo4j:

• Full-text indexes: enable searching within the content of STRING properties and similarity comparisons
between query strings and STRING values stored in the database.

• Vector indexes: enable similarity searches and complex analytical queries by representing nodes or
properties as vectors in a multidimensional space.

Unlike search-performance indexes, semantic indexes are not automatically used by the Cypher planner. To use
semantic indexes, they must be explicitly called with specific procedures.

Full-text indexes
A full-text index is used to index nodes and relationships by STRING properties. Unlike range and text
indexes, which can only perform limited STRING matching (exact, prefix, substring, or suffix matches),
full-text indexes store individual words in any given STRING property. This means that full-text indexes can be
used to match within the content of a STRING property. Full-text indexes also return a score of proximity
between a given query string and the STRING values stored in the database, thus enabling them to
semantically interpret data.

Full-text indexes are powered by the Apache Lucene indexing and search library.

Example graph
The following graph is used for the examples below:

(:Employee {name: 'Maya Tanaka', position: 'Senior Engineer', team: 'Operations'})
  -[:EMAILED {message: 'I have booked a team meeting tomorrow'}]->
(:Employee {name: 'Nils Johansson', position: 'Engineer', team: 'Operations'})

(:Manager {name: 'Lisa Danielsson', position: 'Engineering manager'})
  -[:REVIEWED {message: 'Nils-Erik is reportedly difficult to work with.'}]->
(:Employee {name: 'Nils-Erik Karlsson', position: 'Engineer', team: 'Kernel',
  peerReviews: ['Nils-Erik is difficult to work with.', 'Nils-Erik is often late for work.']})

To recreate it, run the following query against an empty Neo4j database:

CREATE (nilsE:Employee {name: "Nils-Erik Karlsson", position: "Engineer", team: "Kernel", peerReviews:
['Nils-Erik is difficult to work with.', 'Nils-Erik is often late for work.']}),
(lisa:Manager {name: "Lisa Danielsson", position: "Engineering manager"}),
(nils:Employee {name: "Nils Johansson", position: "Engineer", team: "Operations"}),
(maya:Employee {name: "Maya Tanaka", position: "Senior Engineer", team:"Operations"}),
(lisa)-[:REVIEWED {message: "Nils-Erik is reportedly difficult to work with."}]->(nilsE),
(maya)-[:EMAILED {message: "I have booked a team meeting tomorrow."}]->(nils)

Create full-text indexes


Full-text indexes are created with the CREATE FULLTEXT INDEX command. It is recommended to give the
index a name when it is created. If no name is given, a random name will be assigned to the full-text index.

The CREATE FULLTEXT INDEX command is optionally idempotent. This means that its default behavior is to
throw an error if an attempt is made to create the same index twice. If IF NOT EXISTS is appended to the
command, no error is thrown and nothing happens should an index with the same name or a full-text
index on the same schema already exist. As of Neo4j 5.17, an informational notification is instead returned
showing the existing index which blocks the creation. As of Neo4j 5.16, the index name can also be given
as a parameter, CREATE FULLTEXT INDEX $name FOR ….
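
For example, once the namesAndTeams index shown below has been created, re-issuing its definition with IF
NOT EXISTS completes without error instead of failing:

Create a full-text index only if it does not already exist

CREATE FULLTEXT INDEX namesAndTeams IF NOT EXISTS
FOR (n:Employee|Manager) ON EACH [n.name, n.team]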

Creating a full-text index requires the CREATE INDEX privilege.

When creating a full-text index, you need to specify the labels/relationship types and property names it
should apply to.

This statement creates a full-text index named namesAndTeams on each name and team property for nodes
with the label Employee or Manager:

Create a full-text index on a node label and property combination

CREATE FULLTEXT INDEX namesAndTeams FOR (n:Employee|Manager) ON EACH [n.name, n.team]

This query highlights two key differences between full-text and search-performance indexes:

• Full-text indexes can be applied to more than one node label.

• Full-text indexes can be applied to more than one property, but unlike composite search-performance
indexes, a full-text index stores entities that have at least one of the indexed labels or relationship
types, and at least one of the indexed properties.

Similarly, though a relationship can have only one type, a full-text index can store multiple relationship
types. In that case, all relationships matching at least one of the indexed relationship types and having at
least one of the indexed properties will be included.

This statement creates a full-text index named communications on the message property for the relationship
types REVIEWED and EMAILED:

Create a full-text index on a relationship type and property combination

CREATE FULLTEXT INDEX communications FOR ()-[r:REVIEWED|EMAILED]-() ON EACH [r.message]

Tokenization and analyzers

Full-text indexes store individual words in a STRING property. This is achieved by a tokenizer, which breaks
up a stream of characters into individual tokens (usually individual words). How a STRING is tokenized is
determined by what analyzer the full-text index is configured with. The default analyzer (standard-no-
stop-words) analyzes both the indexed values and the query string.

Stop words are common words in a language that can be filtered out during information
retrieval tasks since they are considered to be of little use when determining the
meaning of a string. These words are typically short and frequently used across various
contexts.

For example, the following stop words are included in Lucene’s english analyzer: "a",
"an", "and", "are", "as", "at", "be", "but", and so on.

Removing stop words can help reduce the size of stored data and thereby improve the
efficiency of data retrieval.

In some cases, using different analyzers for the indexed values and query string is more appropriate. For
example, if handling STRING values written in Swedish, it may be beneficial to select the swedish analyzer,
which knows how to tokenize Swedish words, and will avoid indexing Swedish stop words.

A complete list of all available analyzers is included in the result of the
db.index.fulltext.listAvailableAnalyzers procedure.
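
For example, the following call lists the analyzers available on the running DBMS (the exact set depends on
the Neo4j version and on any custom analyzer providers installed):

List available analyzers

CALL db.index.fulltext.listAvailableAnalyzers()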

Neo4j also supports the use of custom analyzers. For more information, see the Java Reference Manual →
Full-text index analyzer providers.

Configuration settings

The CREATE FULLTEXT INDEX command takes an optional OPTIONS clause, where the indexConfig can be
specified. The following statement creates a full-text index using a parameter for nodes with the label
Employee or Manager. Creating and dropping indexes using parameters was introduced in Neo4j 5.16.

Parameters

{
"name": "peerReviews"
}

Create a full-text index using OPTIONS

CREATE FULLTEXT INDEX $name FOR (n:Employee|Manager) ON EACH [n.peerReviews]
OPTIONS {
indexConfig: {
`fulltext.analyzer`: 'english', ①
`fulltext.eventually_consistent`: true ②
}
}

① The fulltext.analyzer setting can be used to configure an index-specific analyzer. In this case, it is set
to the english analyzer. The possible values for the fulltext.analyzer setting can be listed with the
db.index.fulltext.listAvailableAnalyzers procedure.

② The fulltext.eventually_consistent setting, if set to true, will put the index in an eventually
consistent update mode. This means that updates will be applied in a background thread "as soon as
possible", instead of during a transaction commit, as is the case for other indexes.

For more information on how to configure full-text indexes, refer to the Operations Manual → Indexes to
support full-text search.
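
If an index is configured to be eventually consistent, the db.index.fulltext.awaitEventuallyConsistentIndexRefresh
procedure (listed in the procedure table at the end of this section) can be called to block until the
background updates from recently committed transactions have been applied, for example before running a
query that must observe the latest writes:

Wait for eventually consistent full-text indexes to be refreshed

CALL db.index.fulltext.awaitEventuallyConsistentIndexRefresh()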

Query full-text indexes


To query a full-text index, use either the db.index.fulltext.queryNodes or the db.index.fulltext.queryRelationships procedure.

Unlike search-performance indexes, full-text indexes are not automatically used by the Cypher query planner.
To access full-text indexes, they must be explicitly called with the above-mentioned procedures.

This query uses the db.index.fulltext.queryNodes procedure to look for nils in the previously created full-text
index namesAndTeams:

Query a full-text index for a node property

CALL db.index.fulltext.queryNodes("namesAndTeams", "nils") YIELD node, score


RETURN node.name, score

Result

node.name score

"Nils Johansson" 0.3300700783729553

"Nils-Erik Karlsson" 0.27725890278816223

Rows: 2

Many full-text index analyzers (including Neo4j’s default analyzer) normalize tokens to lower case. Full-text
indexes are therefore case-insensitive by default when used on Neo4j.
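
As an illustration, querying the same index with a different casing is expected to return the same matches as
the lower-case query above (assuming the default analyzer):

Query a full-text index with a different casing

CALL db.index.fulltext.queryNodes("namesAndTeams", "NILS") YIELD node, score
RETURN node.name, score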

The score column represents how well the index thinks that the entry matches the given query string.
Thus, in addition to any exact matches, full-text indexes return approximate matches to a given query
string. This is possible because both the property values that are indexed, and the queries to the index, are
processed through the analyzer such that the index can find data entities which do not exactly match the
provided STRING.

The score results are always returned in descending score order, where the best matching result entry is
put first.
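
Approximate matching also extends to Lucene query syntax itself. As a sketch (not part of the example above),
appending ~ to a term performs a fuzzy search, so a slightly misspelled term can still return matches:

Query a full-text index with a fuzzy term

CALL db.index.fulltext.queryNodes("namesAndTeams", "nisl~") YIELD node, score
RETURN node.name, score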

This query uses the db.index.fulltext.queryRelationships procedure to query the previously created
communications full-text index for any message containing "meeting":

Query a full-text index for a relationship property

CALL db.index.fulltext.queryRelationships("communications", "meeting") YIELD relationship, score
RETURN type(relationship), relationship.message, score

Result

type(relationship) relationship.message score

"EMAILED" "I have booked a team meeting tomorrow." 0.3239005506038666

Rows: 1

To only obtain exact matches, quote the STRING you are searching for:

Query a full-text index for exact matches

CALL db.index.fulltext.queryNodes("namesAndTeams", '"Nils-Erik"') YIELD node, score
RETURN node.name, score

Result

node.name score

"Nils-Erik Karlsson" 0.7588480710983276

Rows: 1

Query strings also support the use of the Lucene boolean operators (AND, OR, NOT, +, -):

Query a full-text index using logical operators

CALL db.index.fulltext.queryNodes("namesAndTeams", 'nils AND kernel') YIELD node, score
RETURN node.name, node.team, score

Result

node.name node.team score

"Nils-Erik Karlsson" "Kernel" 0.723090410232544

Rows: 1

It is possible to limit the search to specific properties, by prefixing <propertyName>: to the query string.

Query a full-text index for specific properties

CALL db.index.fulltext.queryNodes("namesAndTeams", 'team:"Operations"') YIELD node, score
RETURN node.name, node.team, score

Result

node.name node.team score

"Nils Johansson" "Operations" 0.21363800764083862

"Maya Tanaka" "Operations" 0.21363800764083862

Rows: 2

A complete description of the Lucene query syntax can be found in the Lucene documentation.

Lists of STRING values

If the indexed property contains a list of STRING values, each entry is analyzed independently and all
produced tokens are associated to the same property name. This means that when querying such an
indexed node or relationship, there is a match if any of the list elements matches the query string. For
scoring purposes, the full-text index treats it as a single-property value, and the score will represent how
close the query is to matching the entire list.

Query a full-text index for content present in a list of STRING properties

CALL db.index.fulltext.queryNodes('peerReviews', 'late') YIELD node, score
RETURN node.name, node.peerReviews, score

Result

node.name node.peerReviews score

"Nils-Erik Karlsson" ["Nils-Erik is difficult to work with.", "Nils-Erik is often late for work."] 0.13076457381248474

Show full-text indexes


To list all full-text indexes in a database, use the SHOW FULLTEXT INDEXES command:

Show all full-text indexes in a database

SHOW FULLTEXT INDEXES

Result

+----+------------------+----------+-------------------+------------+----------------+-------------------------+------------------+----------------+------------------+--------------------------+-----------+
| id | name             | state    | populationPercent | type       | entityType     | labelsOrTypes           | properties       | indexProvider  | owningConstraint | lastRead                 | readCount |
+----+------------------+----------+-------------------+------------+----------------+-------------------------+------------------+----------------+------------------+--------------------------+-----------+
| 4  | "communications" | "ONLINE" | 100.0             | "FULLTEXT" | "RELATIONSHIP" | ["REVIEWED", "EMAILED"] | ["message"]      | "fulltext-1.0" | NULL             | 2023-10-31T15:06:10.270Z | 2         |
| 3  | "namesAndTeams"  | "ONLINE" | 100.0             | "FULLTEXT" | "NODE"         | ["Employee", "Manager"] | ["name", "team"] | "fulltext-1.0" | NULL             | 2023-10-31T15:07:48.874Z | 5         |
| 6  | "peerReviews"    | "ONLINE" | 100.0             | "FULLTEXT" | "NODE"         | ["Employee", "Manager"] | ["peerReviews"]  | "fulltext-1.0" | NULL             | 2023-10-31T15:09:05.391Z | 3         |
+----+------------------+----------+-------------------+------------+----------------+-------------------------+------------------+----------------+------------------+--------------------------+-----------+

Similar to search-performance indexes, the SHOW command can be filtered for particular columns:

Show full-text indexes using filtering

SHOW FULLTEXT INDEXES WHERE name CONTAINS "Team"

Result

+----+-----------------+----------+-------------------+------------+------------+-------------------------+------------------+----------------+------------------+----------+-----------+
| id | name            | state    | populationPercent | type       | entityType | labelsOrTypes           | properties       | indexProvider  | owningConstraint | lastRead | readCount |
+----+-----------------+----------+-------------------+------------+------------+-------------------------+------------------+----------------+------------------+----------+-----------+
| 5  | "namesAndTeams" | "ONLINE" | 100.0             | "FULLTEXT" | "NODE"     | ["Employee", "Manager"] | ["name", "team"] | "fulltext-1.0" | NULL             | NULL     | 0         |
+----+-----------------+----------+-------------------+------------+------------+-------------------------+------------------+----------------+------------------+----------+-----------+

To return full index details, use the YIELD clause. For example:

Show all full-text indexes and all return columns

SHOW FULLTEXT INDEXES YIELD *

Result

+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------+
| id | name | state | populationPercent | type | entityType | labelsOrTypes
| properties | indexProvider | owningConstraint | lastRead | readCount | trackedSince |
options
| failureMessage | createStatement
|
+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------+
| 4 | "communications" | "ONLINE" | 100.0 | "FULLTEXT" | "RELATIONSHIP" | ["REVIEWED",
"EMAILED"] | ["message"] | "fulltext-1.0" | NULL | NULL | 0 | 2023-11-
01T09:27:57.024Z | {indexConfig: {`fulltext.analyzer`: "standard-no-stop-words",
`fulltext.eventually_consistent`: FALSE}, indexProvider: "fulltext-1.0"} | "" | "CREATE
FULLTEXT INDEX `communications` FOR ()-[r:`REVIEWED`|`EMAILED`]-() ON EACH [r.`message`] OPTIONS
{indexConfig: {`fulltext.analyzer`: 'standard-no-stop-words',`fulltext.eventually_consistent`: false},
indexProvider: 'fulltext-1.0'}" |
| 5 | "namesAndTeams" | "ONLINE" | 100.0 | "FULLTEXT" | "NODE" | ["Employee",
"Manager"] | ["name", "team"] | "fulltext-1.0" | NULL | NULL | 0 | 2023-11-
01T12:24:48.002Z | {indexConfig: {`fulltext.analyzer`: "standard-no-stop-words",
`fulltext.eventually_consistent`: FALSE}, indexProvider: "fulltext-1.0"} | "" | "CREATE
FULLTEXT INDEX `namesAndTeams` FOR (n:`Employee`|`Manager`) ON EACH [n.`name`, n.`team`] OPTIONS
{indexConfig: {`fulltext.analyzer`: 'standard-no-stop-words',`fulltext.eventually_consistent`: false},
indexProvider: 'fulltext-1.0'}" |
| 6 | "peerReviews" | "ONLINE" | 100.0 | "FULLTEXT" | "NODE" | ["Employee",
"Manager"] | ["peerReviews"] | "fulltext-1.0" | NULL | NULL | 0 | 2023-11-
01T12:25:41.495Z | {indexConfig: {`fulltext.analyzer`: "english", `fulltext.eventually_consistent`: TRUE},
indexProvider: "fulltext-1.0"} | "" | "CREATE FULLTEXT INDEX `peerReviews` FOR
(n:`Employee`|`Manager`) ON EACH [n.`peerReviews`] OPTIONS {indexConfig: {`fulltext.analyzer`:
'english',`fulltext.eventually_consistent`: true}, indexProvider: 'fulltext-1.0'}" |
+---------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------+

For a full description of all return columns, see Search-performance indexes → Result columns for listing
indexes.

Drop full-text indexes


A full-text index is dropped by using the same command as for other indexes, DROP INDEX.

In the following example, the previously created communications full-text index is deleted from the
database:

Drop a full-text index

DROP INDEX communications

As of Neo4j 5.16, the index name can also be given as a parameter when dropping an index: DROP INDEX
$name.
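
For example, using the same parameter convention as above (here naming the namesAndTeams index created earlier):

Parameters

{
  "name": "namesAndTeams"
}

Drop a full-text index using a parameter

DROP INDEX $name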

List of full-text index procedures
The procedures for full-text indexes are listed below:

• Eventually consistent indexes: db.index.fulltext.awaitEventuallyConsistentIndexRefresh. Wait for the updates
from recently committed transactions to be applied to any eventually-consistent full-text indexes.

• List available analyzers: db.index.fulltext.listAvailableAnalyzers. List the available analyzers that the
full-text indexes can be configured with.

• Use full-text node index: db.index.fulltext.queryNodes. Query the given full-text index. Returns the matching
nodes and their Lucene query score, ordered by score.

• Use full-text relationship index: db.index.fulltext.queryRelationships. Query the given full-text index.
Returns the matching relationships and their Lucene query score, ordered by score.

Summary
• Full-text indexes support the indexing of both nodes and relationships.

• Full-text indexes include only property values of types STRING or LIST<STRING>.

• Full-text indexes are accessed via Cypher procedures.

• Full-text indexes return the score for each result from a query.

• Full-text indexes support configuring custom analyzers, including analyzers that are not included with
Lucene itself.

• Full-text indexes can be queried using the Lucene query language.

• Full-text indexes are kept up to date automatically, as nodes and relationships are added, removed,
and modified.

• Full-text indexes can be checked by the consistency checker, and they can be rebuilt if there is a
problem with them.

• Newly created full-text indexes get automatically populated with the existing data in the database.

• Full-text indexes can support any number of properties in a single index.

• Full-text indexes are created, dropped, and updated transactionally, and are automatically replicated
throughout a cluster.

• Full-text indexes can be configured to be eventually consistent, in which index updating is moved from
the commit path to a background thread. Using this feature, it is possible to work around the slow
Lucene writes from the performance-critical commit process, thus removing the main bottlenecks for
Neo4j write performance.

Vector indexes
Node vector indexes were released as a public beta in Neo4j 5.11 and became generally available in Neo4j 5.13.

Vector indexes enable similarity searches and complex analytical queries by representing nodes or
properties as vectors in a multidimensional space.

The following resources provide hands-on tutorials for working with LLMs and vector indexes in Neo4j:

• GraphAcademy → LLM & Neo4j Fundamentals

• GraphAcademy → Introduction to Vector Indexes and Unstructured Data

• GenAI documentation → Embeddings & Vector Indexes Tutorial

Neo4j vector indexes are powered by the Apache Lucene indexing and search library.

Example graph
The examples on this page use the Neo4j movie recommendations dataset, focusing on the plot and
embedding properties of Movie nodes. The embedding property consists of a 1536-dimension vector
embedding of the plot and title property combined.

The graph contains 28863 nodes and 332522 relationships.

To recreate the graph, download and import this dump file to an empty Neo4j database (running version
5.13 or later). Dump files can be imported into both Aura and on-prem instances.

The dump file used to load the dataset contains embeddings generated by OpenAI, using the model text-embedding-ada-002.

Vectors and embeddings in Neo4j


Vector indexes allow you to query vector embeddings from large datasets. An embedding is a numerical
representation of a data object, such as a text, image, or document. Each word or token in a text is typically
represented as a high-dimensional vector, where each dimension represents a certain aspect of the word’s
meaning.

The embedding for a particular data object can be created by both proprietary (such as Vertex AI or
OpenAI) and open source (such as sentence-transformers) embedding generators, which can produce
vector embeddings with dimensions such as 256, 768, 1536, and 3072. In Neo4j, vector embeddings are
stored as LIST<INTEGER | FLOAT> properties on a node or relationship.
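
As a quick check (a sketch against the example dataset described above), a stored embedding can be inspected
like any other list property; for this dataset the reported dimension should be 1536:

Inspect an embedding property

MATCH (m:Movie)
WHERE m.embedding IS NOT NULL
RETURN m.title, size(m.embedding) AS dimensions
LIMIT 1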

For information about how embeddings can be generated and stored as properties, see:

• GenAI integrations

• GenAI documentation → Embeddings & Vector Indexes Tutorial

For example, the movie The Godfather has the following plot: "The aging patriarch of an organized
crime dynasty transfers control of his clandestine empire to his reluctant son." This is its 1536-
dimensional embedding property, where each element in the LIST represents a particular aspect of the
plot’s meaning:

[0.005967312026768923, -0.03817005082964897, 0.0014667075593024492, -0.03868866711854935,


-0.006505374796688557, 0.020900176838040352, -0.0027551413513720036, -0.0024731445591896772,
-0.03734026849269867, -0.02228747308254242, 0.028783122077584267, 0.017905177548527718,
0.011396560817956924, 0.014235977083444595, 0.023143187165260315, -0.014184115454554558,
0.029846282675862312, -0.011928141117095947, 0.018838683143258095, -0.0019172541797161102,
0.0033483069855719805, 0.009497134014964104, -0.03516208380460739, 0.0021441481076180935,
0.002657901030033827, 0.0030760341323912144, 0.004255882930010557, -0.020809419453144073,
0.02358401007950306, -0.013808120042085648, 0.01064456906169653, -0.006975369527935982,
0.007318951655179262, -0.013872946612536907, 0.005905726458877325, -0.010689947754144669,
0.0020225979387760162, -0.016245609149336815, -0.00038815077277831733, -0.007163367234170437,
0.027668101713061333, 0.007215228863060474, -0.009380445815622807, -0.02956104464828968,
-0.000863007502630353, 0.012142069637775421, 0.0012957267463207245, -0.027953339740633965,
-0.016414159908890724, 0.008453421294689178, -0.0010777463903650641, 0.03311355784535408,
-0.013639570213854313, -0.052457891404628754, 0.0010242642601951957, 0.0034390646032989025,
-0.01049546804279089, 0.006456754636019468, 0.003970644902437925, -0.011629937216639519,
0.005280147306621075, -0.023402493447065353, -0.014689764939248562, -0.007623638026416302,
-0.002453696448355913, 0.02290981076657772, 0.0017989451298490167, 0.0013427261728793383,
-0.001776255783624947, -0.002414800226688385, 0.04833490028977394, 0.031142819672822952,
-0.0033013075590133667, 0.017879245802760124, 0.0070077828131616116, -0.016154851764440536,
-0.005772831384092569, 0.019875913858413696, -0.018008900806307793, 0.012764407321810722,
0.0055232481099665165, -0.027901478111743927, -0.0034909259993582964, 0.0307279285043478,
0.006472961511462927, 0.008861830458045006, -0.01802186481654644, 0.018281172960996628,
-0.014223011210560799, -0.00018313586770091206, 0.0026352116838097572, 0.0006754148053005338,
0.014975002966821194, 0.024361930787563324, -0.017166150733828545, 0.0028880364261567593,
0.011824417859315872, 0.01710132323205471, -0.0005003822734579444, -0.018890544772148132,
-0.002192768268287182, -0.0018264965619891882, 0.011033530347049236, -0.009095207788050175,
-0.022689398378133774, -0.004281813744455576, 0.007092057727277279, -0.015247276052832603,
0.024115590378642082, 0.002996621420606971, -0.02834230102598667, 0.030546413734555244,
0.02350621670484543, -0.020511215552687645, 0.010190781205892563, -0.016582708805799484,
0.028238577768206596, -0.011967036873102188, 0.011623455211520195, -0.02797926962375641,
0.0026254875119775534, 0.018307102844119072, 0.0038701631128787994, -0.03850715234875679,
0.006246067117899656, -0.0006312514888122678, 0.010352848097682, -0.02358401007950306,
-0.026708664372563362, -0.002863726345822215, 0.035862214863300323, 0.009860164485871792,
-0.01726987399160862, 0.004275330808013678, -0.02663087099790573, 0.009140586480498314,
-0.013872946612536907, 0.019136887043714523, -0.020835351198911667, -0.0250879917293787,
0.03044269047677517, 0.026280807331204414, -0.013406192883849144, 0.006683648563921452,
-0.01216800045222044, 0.007824601605534554, 0.031505849212408066, 0.023726629093289375,
0.0294832531362772, -0.013678465969860554, 0.033891480416059494, 0.009211895987391472,
0.017088359221816063, -0.02183368429541588, 0.01847565360367298, 0.004644844215363264,
-0.009834233671426773, -0.011344699189066887, -0.0006725785788148642, 0.00012691882147919387,
0.015338033437728882, 0.025736261159181595, -0.003967403434216976, -0.007312469184398651,
-0.01312743779271841, 0.02350621670484543, -0.0006843284936621785, -0.011785522103309631,
0.006570201832801104, -0.004187814891338348, -0.0070013003423810005, 0.0165178831666708,
-0.004537879955023527, 0.022715330123901367, -0.0025120405480265617, 0.025580676272511482,
0.005053253378719091, -0.0020063910633325577, -0.039285074919462204, -0.001816772622987628,
0.0007224142318591475, 0.0161029901355505, 0.04086684808135033, 0.03536953032016754, 0.009626788087189198,
-0.023571044206619263, -0.009607339277863503, 0.011085391975939274, 0.020835351198911667,
-0.0009027139167301357, -0.007584741804748774, 0.016958704218268394, 0.011130770668387413,
-0.016829051077365875, -0.6712950468063354, -0.006511857267469168, -0.024854615330696106,
-0.02663087099790573, -0.00008933950448408723, 0.0061779990792274475, 0.004605947993695736,
0.013231161050498486, -0.020187081769108772, 0.00798666849732399, -0.001847565290518105,
0.04086684808135033, 0.007519915234297514, 0.0040808506309986115, -0.034021131694316864,
-0.01997963711619377, -0.004972219467163086, -0.023220978677272797, 0.012129104696214199,
0.0018329792656004429, -0.011649386025965214, 0.028446022421121597, -0.0010356089333072305,
-0.006223377771675587, 0.021211346611380577, 0.004006299655884504, 0.021937407553195953,
-0.02927580662071705, -0.01129283756017685, -0.009296170435845852, -0.01864420250058174,
0.02717541716992855, -0.0003555347793735564, 0.0021700789220631123, 0.048360832035541534,
-0.002277043182402849, -0.009049829095602036, 0.033969271928071976, 0.004557327833026648,
0.018916476517915726, -0.000779542897362262, -0.00638544512912631, 0.022183749824762344,

-0.012757924385368824, -0.027149485424160957, -0.012278205715119839, 0.0238303504884243,
-0.02963883802294731, 0.005218561738729477, -0.004434156697243452, 0.013665501028299332,
-0.0024520757142454386, 0.002124700229614973, -0.007273572962731123, -0.0035654769744724035,
-0.0028621056117117405, 0.020640870556235313, 0.01091684214770794, -0.0006867594784125686,
-0.011694764718413353, 0.011215046048164368, 0.016504917293787003, 0.00827838946133852,
-0.0044471221044659615, 0.010676982812583447, 0.027771824970841408, -0.0133802630007267,
0.029820352792739868, 0.008349698968231678, -0.014573076739907265, -0.009017415344715118,
0.011655868031084538, -0.0061066895723342896, -0.013082059100270271, 0.004353123251348734,
0.00672254478558898, 0.01773662678897381, 0.012433790601789951, 0.023843316361308098,
0.015221345238387585, -0.0046221548691391945, -0.00026214358513243496, -0.016582708805799484,
0.016504917293787003, 0.028005201369524002, 0.005516765173524618, -0.04309689253568649,
0.013743292540311813, -0.0064308238215744495, -0.007176332641392946, 0.01911095716059208,
0.00446332897990942, -0.012971853837370872, -0.016919808462262154, 0.010048162192106247,
0.0032769974786788225, -0.021548446267843246, 0.001816772622987628, 0.01856641098856926,
-0.04804966226220131, 0.007286538369953632, -0.007299503777176142, -0.014080392196774483,
0.008952588774263859, 0.023908143863081932, 0.012932957150042057, -0.008433973416686058,
0.012783855199813843, 0.0430709607899189, -0.01015836838632822, 0.03534360229969025,
-0.007584741804748774, -0.016453055664896965, -0.005720969755202532, -0.014871280640363693,
-0.026540113613009453, 0.005228285677731037, 0.0004019264888484031, 0.005931657273322344,
-0.02533433400094509, -0.018825719133019447, 0.0023353875149041414, 0.0014059323584660888,
-0.02020004764199257, 0.022481953725218773, 0.034980569034814835, -0.02709762565791607,
-0.022974636405706406, -0.025023166090250015, 0.00641785841435194, -0.00019822835747618228,
-0.004845807328820229, 0.0003723492263816297, -0.010132437571883202, 0.01498796883970499,
0.001948046963661909, -0.0020161152351647615, -0.008842382580041885, 0.0223652645945549,
-0.013574742712080479, -0.002369421534240246, 0.003275376744568348, 0.005879795644432306,
0.005789037793874741, 0.006359514314681292, -0.03549918532371521, 0.003118171589449048,
-0.026993902400135994, -0.01614188589155674, 0.011578075587749481, 0.0008524731383658946,
-0.013367297127842903, 0.004194297362118959, 0.019331367686390877, 0.006152068264782429,
-0.015208380296826363, -0.0018005658639594913, -0.015714028850197792, -0.01681608520448208,
-0.028990568593144417, 0.010676982812583447, 0.024595309048891068, -0.045560311526060104,
-0.0009262136882171035, 0.014845349825918674, -0.020887212827801704, 0.015739960595965385,
0.011727177537977695, 0.0012560202740132809, -0.023052429780364037, 0.0014245701022446156,
-0.013062611222267151, -0.011299320496618748, 0.022274507209658623, 0.011338216252624989,
-0.007908876053988934, 0.010339883156120777, -0.006132620386779308, 0.01247916929423809,
-0.007947771809995174, -0.0025347298942506313, -0.011416008695960045, 0.011027047410607338,
0.004521673079580069, 0.04880165681242943, 0.0012543996563181281, 0.02115948498249054, 0.0165178831666708,
-0.025373229756951332, 0.026125222444534302, -0.0031262750271707773, 0.007669016718864441,
0.003821542952209711, -0.021561412140727043, 0.008187631145119667, 0.02358401007950306,
0.02249491773545742, 0.015247276052832603, -0.004560569301247597, 0.030753860250115395,
0.031090958043932915, -0.021457688882946968, 0.027694031596183777, -0.004823117982596159,
0.0049171168357133865, -0.018346000462770462, -0.0030355174094438553, -0.011176149360835552,
0.024102624505758286, 0.006923507899045944, 0.010009266436100006, -0.00510187353938818,
0.0007916979375295341, -0.004722636193037033, 0.019914809614419937, 0.026190048083662987,
-0.013289504684507847, 0.006346548907458782, -0.015415825881063938, -0.026734594255685806,
0.003623821074143052, 0.005325525999069214, -0.003922024741768837, -0.00640813447535038,
-0.014624938368797302, -0.0065021333284676075, 0.007435640320181847, -0.002808623481541872,
0.010138919577002525, -0.033813685178756714, -0.0032008260022848845, 0.01614188589155674,
-0.018994268029928207, 0.008135770447552204, -0.008596041239798069, -0.015662167221307755,
0.004310985561460257, -0.014663834124803543, 0.014962038025259972, -0.03479905426502228,
0.013114472851157188, 0.01341915875673294, 0.05092797800898552, -0.011908693239092827,
0.005332008935511112, -0.013367297127842903, 0.02501020021736622, -0.00029678543796762824,
-0.02454344742000103, 0.003152205841615796, -0.015454721637070179, 0.010028714314103127,
-0.02102983184158802, -0.0032624113373458385, 0.03583628311753273, -0.015026864595711231,
0.00672254478558898, 0.000010907877367571928, 0.019875913858413696, 0.020161151885986328,
0.014054462313652039, -0.005675591062754393, -0.009224860928952694, 0.014793488197028637,
0.03687351569533348, -0.005442214198410511, 0.005633453372865915, -0.0030436208471655846,
-0.012615305371582508, -0.009075759910047054, 0.017192082479596138, -0.002220319816842675,
0.005798762198537588, -0.0007568534929305315, 0.010378778912127018, 0.005908967927098274,
-0.0158825796097517, 0.0088812792673707, 0.007766257040202618, -0.0030209312681108713,
-0.013561777770519257, -0.035395462065935135, 0.022391194477677345, -0.0027049004565924406,
0.004748567007482052, -0.020433424040675163, -0.00028706141165457666, -0.005092149134725332,
-0.018371930345892906, 0.006009449250996113, -0.00645027169957757, 0.015286171808838844,
-0.012343033216893673, -0.008628454059362411, -0.010605673305690289, 0.009192448109388351,
0.007500466890633106, -0.013535846956074238, 0.003831267124041915, -0.02956104464828968,
0.0009724028059281409, 0.0034585127141326666, -0.00004074468961334787, -0.025139853358268738,
0.012278205715119839, 0.023519182577729225, -0.012913509272038937, -0.006301170215010643,
0.0037178201600909233, 0.004716153722256422, -0.017905177548527718, 0.009769407100975513,
-0.019746258854866028, -0.011675315909087658, 0.007409709505736828, -0.022676432505249977,
-0.013406192883849144, 0.003922024741768837, 0.03925914317369461, -0.011325251311063766,
-0.014611972495913506, -0.022404160350561142, -0.03311355784535408, 0.0024634203873574734,
0.1057974249124527, 0.014145219698548317, 0.025956671684980392, 0.006878129206597805,
-0.019914809614419937, -0.019162818789482117, -0.009231343865394592, -0.04423784464597702,
0.012018898501992226, -0.00921837892383337, 0.02408965863287449, -0.026501217857003212,
0.020225977525115013, 0.005014357157051563, 0.02053714729845524, 0.014521215111017227,
-0.002670866437256336, -0.020433424040675163, -0.0015372068155556917, -0.031168751418590546,
0.0051213214173913, 0.006865163799375296, 0.010048162192106247, 0.003795612370595336,
-0.009749959222972393, -0.024063728749752045, 0.026449356228113174, 0.00967864878475666,
-0.009049829095602036, -0.012284688651561737, -0.02475089207291603, 0.0034844432957470417,
-0.00928320549428463, 0.011772556230425835, -0.01811262220144272, -0.01918874867260456,
0.009043346159160137, 0.023843316361308098, 0.02580108679831028, 0.005980277433991432,

0.029327668249607086, -0.008103356696665287, 0.008083908818662167, -0.005490834359079599,
0.021146519109606743, -0.0023499734234064817, -0.03298390284180641, 0.005283388774842024,
-0.00043352958164177835, -0.024271173402667046, 0.03181701898574829, -0.000028944177756784484,
-0.004479535389691591, -0.002066355897113681, 0.017995934933423996, -0.012783855199813843,
0.013859981670975685, -0.006615580525249243, -0.0008403180981986225, 0.025489918887615204,
-0.01789221167564392, -0.03189481049776077, 0.00028949242550879717, -0.03251715004444122,
0.03588814660906792, -0.03500650078058243, -0.007869980297982693, -0.024361930787563324,
0.00451519014313817, -0.018177449703216553, 0.020627904683351517, 0.003249445930123329,
0.010962220840156078, -0.005299595184624195, 0.048023734241724014, -0.0033094107639044523,
0.012971853837370872, -0.02290981076657772, 0.017918141558766365, -0.016245609149336815,
-0.013179299421608448, -0.020589008927345276, 0.0037469922099262476, -0.029327668249607086,
-0.007383778691291809, 0.013017232529819012, 0.006327101029455662, -0.02689017914235592,
-0.004385536536574364, 0.005789037793874741, -0.005597798619419336, -0.004152160137891769,
0.012719028629362583, -0.008220044896006584, -0.01702353172004223, -0.011506766080856323,
0.0042980206198990345, 0.0018702547531574965, -0.0032964455895125866, 0.007267090491950512,
-0.009581409394741058, -0.0058182100765407085, -0.005429248791188002, -0.008829417638480663,
-0.0030403793789446354, -0.01194110605865717, -0.002591453492641449, 0.005756624508649111,
-0.01618078351020813, -0.009821268729865551, -0.00021210535487625748, -0.01768476516008377,
-0.0005562954465858638, -0.017451388761401176, -0.015545479021966457, 0.02332470193505287,
0.015960371121764183, 0.02208002656698227, 0.01369143184274435, -0.014495284296572208,
-0.007701430004090071, -0.0005567006301134825, 0.027590308338403702, 0.05188741534948349,
-0.023609939962625504, -0.017957039177417755, 0.015999266877770424, -0.020900176838040352,
0.003038758644834161, 0.021042795851826668, -0.009814785793423653, 0.0014083633432164788,
0.010897394269704819, -0.0167253278195858, -0.020135220140218735, -0.005273664370179176,
-0.009788854978978634, -0.002986897248774767, -0.008764590136706829, -0.006729027256369591,
-0.018449721857905388, -0.009166517294943333, -0.002651418326422572, 0.008245975710451603,
0.0034358231350779533, -0.028757192194461823, 0.01511762198060751, -0.008544179610908031,
0.005344973877072334, 0.013924808241426945, -0.003299686824902892, -0.04143732413649559,
-0.0008403180981986225, 0.010949255898594856, -0.013600673526525497, -0.03448788449168205,
-0.007863497361540794, -0.01809965819120407, -0.00444063963368535, 0.004920358303934336,
0.0330357663333416, -0.008816451765596867, 0.006683648563921452, 0.00823301076889038,
-0.015947405248880386, 0.02608632668852806, 0.0037243026308715343, -0.007623638026416302,
-0.028031131252646446, 0.027123555541038513, 0.01843675784766674, 0.016712361946702003,
0.040374163538217545, -0.0021538722794502974, 0.01885164901614189, -0.011740143410861492,
0.017490284517407417, -0.0004517621418926865, -0.00034439266892150044, -0.026190048083662987,
-0.021729961037635803, 0.0020209772046655416, -0.014521215111017227, -0.01467679999768734,
0.002505557844415307, -0.01061863824725151, 0.015623271465301514, -0.010087057948112488,
-0.0031748951878398657, 0.01631043665111065, 0.016375262290239334, -0.013257091864943504,
0.010741809383034706, -0.012932957150042057, -0.002484489232301712, 0.0027324517723172903,
0.00897203665226698, -0.004793945699930191, 0.0043466403149068356, -0.0020047705620527267,
0.0021538722794502974, 0.021263208240270615, -0.0269679706543684, -0.024115590378642082,
-0.0025833500549197197, 0.030598275363445282, 0.002772968728095293, 0.01584368385374546,
0.006981851998716593, -0.0037113374564796686, -0.01273199450224638, -0.026280807331204414,
-0.02182071842253208, -0.049527715891599655, 0.02195037342607975, -0.008628454059362411,
-0.004353123251348734, 0.01064456906169653, -0.009698097594082355, -0.04094463959336281,
0.0238303504884243, 0.0034649954177439213, 0.032802388072013855, 0.0002048123424174264,
0.022507883608341217, 0.03770329803228378, -0.010346366092562675, 0.0028588641434907913,
0.026410460472106934, 0.019085025414824486, 0.008848865516483784, 0.015830717980861664,
-0.004469811450690031, 0.013808120042085648, -0.012031864374876022, -0.02099093608558178,
-0.006054827943444252, -0.045638103038072586, -0.024050762876868248, 0.014417491853237152,
0.01218744833022356, 0.0032413427252322435, -0.013302470557391644, -0.0003156257444061339,
0.006942956242710352, 0.00542600778862834, -0.0034358231350779533, 0.022067060694098473,
-0.013847015798091888, -0.026942040771245956, -0.0334506556391716, -0.01835896447300911,
-0.0021036313846707344, -0.001962633104994893, 0.012615305371582508, -0.0186053067445755,
0.01572699472308159, -0.02542509138584137, 0.019422125071287155, -0.013950739055871964,
-0.002110114088281989, 0.02052418142557144, -0.0014197081327438354, 0.0010485743405297399,
-0.004372571129351854, 0.0069299908354878426, -0.005105114541947842, -0.003756716148927808,
-0.015960371121764183, 0.025554746389389038, 0.003516856813803315, 0.005951105151325464,
0.009736993350088596, 0.043459922075271606, -0.008952588774263859, 0.021315069869160652,
-0.011318768374621868, -0.016375262290239334, -0.004560569301247597, -0.026656802743673325,
0.004842565860599279, 0.0004894427256658673, -0.023635871708393097, 0.007448605261743069,
-0.008965553715825081, 0.0026092808693647385, -0.01999260112643242, -0.007811635732650757,
0.012142069637775421, -0.01375625841319561, -0.02102983184158802, -0.006806819699704647,
0.015869613736867905, -0.0074032265692949295, -0.001892944099381566, -0.0037016132846474648,
-0.005322284530848265, 0.03293204307556152, -0.014430457726120949, 0.0418262854218483,
-0.012641236186027527, 0.018216345459222794, -0.028290439397096634, 0.02576219104230404,
0.008433973416686058, 0.013963703997433186, 0.030598275363445282, -0.01225227490067482,
0.012051312252879143, 0.0014553628861904144, -0.008822934702038765, 0.01100111659616232,
0.009860164485871792, -0.004388778004795313, -0.01685498282313347, 0.01091035921126604,
-0.00033223762875422835, -0.007850532419979572, -0.0006320617976598442, 0.002114976057782769,
-0.007532880175858736, 0.01710132323205471, 0.015610306523740292, -0.009036863222718239,
0.008200597018003464, 0.012174483388662338, 0.00447305291891098, 0.0186053067445755,
-0.019253576174378395, 0.010638087056577206, -0.02086128108203411, 0.022404160350561142,
0.010437123477458954, 0.0006920266896486282, -0.02128913812339306, -0.009296170435845852,
-0.004106780979782343, 0.044808320701122284, -0.013782189227640629, -0.003750233445316553,
-0.01181145291775465, 0.02764216996729374, 0.011960554867982864, -0.005043528974056244,
0.006155309733003378, -0.015584375709295273, 0.012433790601789951, -0.021600307896733284,
-0.04314875230193138, -0.01214855257421732, -0.024776823818683624, 0.039077628403902054,
0.016271540895104408, 0.000348039175150916, -0.01511762198060751, 0.0014926382573321462,
-0.04068533331155777, -0.0020290804095566273, -0.006904060021042824, 0.02099093608558178,

0.017049461603164673, -0.006981851998716593, 0.007364330347627401, 0.007416191976517439,
0.00766253424808383, 0.02153548039495945, -0.002995000686496496, 0.02157437615096569,
-0.011312286369502544, -0.009685131721198559, 0.00414891866967082, -0.009672166779637337,
-0.01308854203671217, -0.003380720503628254, -0.003168412484228611, 0.013769223354756832,
-0.012615305371582508, 0.007973702624440193, 0.001315985107794404, -0.006139102857559919,
-0.028212646022439003, 0.0004906582762487233, 0.0006340876570902765, 0.013289504684507847,
-0.010359331034123898, -0.02956104464828968, 0.0263456329703331, 0.02621597982943058,
0.005357939284294844, -0.022754225879907608, -0.009393410757184029, 0.007053161505609751,
-0.018086692318320274, -0.0012552099069580436, 0.003977127373218536, -0.010839049704372883,
-0.01584368385374546, 0.007753291632980108, 0.005951105151325464, 0.02478978969156742,
-0.00858955830335617, 0.007280055433511734, 0.013257091864943504, -0.0000065713156800484285,
0.007234676741063595, -0.00413919473066926, -0.01467679999768734, -0.018333034589886665,
-0.017658835276961327, -0.01681608520448208, 0.005108356010168791, -0.007630120497196913,
0.008479352109134197, -0.02771996334195137, 0.004567051772028208, -0.018579376861453056,
-0.003983610309660435, -0.0023110774345695972, 0.023065393790602684, 0.04281165450811386,
-0.015273206867277622, -0.006696613971143961, 0.002272181212902069, -0.008356180973351002,
-0.014508250169456005, -0.0066090975888073444, 0.00827838946133852, -0.016906842589378357,
0.003750233445316553, -0.008524730801582336, -0.0022802846506237984, -0.005156976170837879,
-0.009633270092308521, -0.035940006375312805, -0.004323950968682766, 0.027771824970841408,
0.19261354207992554, -0.014547145925462246, -0.006657717749476433, 0.013808120042085648,
-0.021340999752283096, 0.011869796551764011, 0.024115590378642082, 0.014080392196774483,
0.0023856281768530607, 0.0005133476224727929, -0.016206713393330574, 0.01723097823560238,
0.008012599311769009, 0.0019723570439964533, 0.006560477428138256, -0.040996503084897995,
-0.010657534934580326, 0.00037032339605502784, -0.027875546365976334, -0.011727177537977695,
-0.00768198212608695, -0.007299503777176142, -0.011202080175280571, -0.01939619518816471,
0.039622172713279724, -0.011668833903968334, -0.015830717980861664, 0.016919808462262154,
0.03207632899284363, 0.015960371121764183, -0.01093629002571106, -0.016842016950249672,
-0.008336733095347881, -0.013244125992059708, -0.011999450623989105, -0.020122256129980087,
-0.007422674912959337, -0.02501020021736622, -0.008505282923579216, -0.005526489112526178,
-0.0011830900330096483, 0.01773662678897381, 0.010709396563470364, -0.007267090491950512,
0.015999266877770424, 0.02604742906987667, -0.013315435498952866, 0.01621967926621437,
-0.02082238532602787, -0.01689387857913971, -0.0439007468521595, -0.03358031064271927,
0.000994281843304634, 0.03726247698068619, -0.02208002656698227, 0.000011990435268671717,
0.006949438713490963, 0.020433424040675163, 0.00515373470261693, -0.031298406422138214,
0.0031116888858377934, 0.015701064839959145, -0.02813485451042652, -0.007377295754849911,
0.007461570668965578, 0.03985555097460747, -0.010975186713039875, -0.025697365403175354,
0.0397258959710598, -0.026319703087210655, -0.0030403793789446354, -0.010067610070109367,
-0.002486109733581543, -0.0088812792673707, 0.0017438423819839954, -0.001923736883327365,
0.017827384173870087, 0.006220136769115925, 0.010255607776343822, 0.001199296792037785,
-0.01772366091609001, 0.035136155784130096, -0.0061066895723342896, -0.010735327377915382,
-0.010651051998138428, -0.026151152327656746, 0.006981851998716593, 0.006622062996029854,
-0.010048162192106247, -0.0009124379721470177, -0.00419105589389801, -0.019668467342853546,
0.00012296844215597957, 0.004894427489489317, 0.006852198392152786, 0.010437123477458954,
0.005908967927098274, 0.0038247844204306602, -0.008103356696665287, -0.006456754636019468,
-0.028653468936681747, 0.018216345459222794, 0.032205980271101, 0.00022101905778981745,
-0.029664767906069756, -0.008155218325555325, 0.03871459513902664, 0.03394334018230438,
0.005860347766429186, -0.013600673526525497, -0.016958704218268394, 0.006372479721903801,
0.0012543996563181281, -0.01911095716059208, -0.010437123477458954, 0.008356180973351002,
-0.012855164706707, -0.008472870104014874, 0.019370263442397118, -0.029457321390509605,
0.0034487885423004627, -0.015415825881063938, -0.00047364120837301016, 0.008887761272490025,
-0.0020015290938317776, 0.010501950047910213, -0.007500466890633106, -0.0017470837337896228,
0.01717911660671234, -0.024063728749752045, 0.026734594255685806, -0.024556411430239677,
0.0013573121977970004, -0.00010007645323639736, -0.00450546620413661, 0.007513432297855616,
0.027201347053050995, 0.003426099196076393, -0.022183749824762344, 0.002813485451042652,
0.008064460940659046, 0.002243009163066745, 0.009899060241878033, 0.010988151654601097,
-0.004790704697370529, -0.004638361278921366, 0.006025656126439571, -0.010605673305690289,
-0.01625857502222061, -0.020342666655778885, -0.016090024262666702, -0.026410460472106934,
0.0121226217597723, -0.009406376630067825, 0.0023759042378515005, -0.0273828636854887,
-0.015260240994393826, -0.004832841921597719, -0.0007702240254729986, 0.01856641098856926,
-0.031039098277688026, 0.0073967440985143185, 0.018721995875239372, -0.023026498034596443,
-0.008200597018003464, -0.023480286821722984, -0.16450461745262146, 0.025710329413414,
0.01681608520448208, -0.009023898281157017, 0.023428425192832947, -0.022754225879907608,
0.027616240084171295, 0.015234310179948807, -0.009224860928952694, 0.005166700109839439,
0.0008131718495860696, 0.0038507150020450354, -0.03153178095817566, -0.0026757284067571163,
0.003335341578349471, 0.00672254478558898, -0.030546413734555244, 0.036277107894420624,
0.017256908118724823, 0.0010526260593906045, 0.0053125605918467045, -0.02091314271092415,
-0.0016555157490074635, -0.0012454859679564834, 0.023467320948839188, 0.009497134014964104,
0.0046351198107004166, 0.005380628630518913, -0.021691065281629562, -0.013062611222267151,
-0.048023734241724014, -0.0008427490829490125, 0.017321735620498657, 0.021340999752283096,
0.011740143410861492, 0.012219862081110477, -0.012984818778932095, 0.007020748220384121,
-0.015130587853491306, -0.016193747520446777, 0.0071439193561673164, 0.03236156702041626,
0.024997234344482422, 0.01185683161020279, 0.010735327377915382, 0.04636416584253311,
0.014599007554352283, -0.009004450403153896, 0.019383229315280914, -0.009607339277863503,
-0.00414891866967082, -0.008336733095347881, -0.019888877868652344, -0.0005830365116707981,
0.02771996334195137, 0.005620488431304693, -0.00701426574960351, 0.013730327598750591,
0.014145219698548317, 0.011331734247505665, -0.021807754412293434, 0.022857949137687683,
0.01593444123864174, -0.0031343784648925066, 0.001282761339098215, -0.028627539053559303,
0.013354332186281681, 0.0034098925534635782, -0.014689764939248562, -0.004784221760928631,
-0.015208380296826363, -0.00796722061932087, -0.008693280629813671, -0.02311725541949272,
0.011629937216639519, -0.012323584407567978, -0.03243935853242874, 0.007643085904419422,

616
0.00766253424808383, 0.0028702090494334698, -0.017412493005394936, 0.026267841458320618,
0.010884428396821022, -0.03448788449168205, 0.004327192436903715, 0.018838683143258095,
-0.02228747308254242, -0.014702730812132359, -0.01020374707877636, -0.027694031596183777,
0.006122896447777748, -0.004252641461789608, -0.012686614878475666, -0.008829417638480663,
0.03319134935736656, 0.01789221167564392, 0.021250242367386818, -0.006683648563921452,
0.009412859566509724, -0.02294870652258396, 0.0009659201023168862, -0.008336733095347881,
-0.019603639841079712, -0.012116138823330402, 0.009775889106094837, 0.03993334248661995,
0.009892578236758709, 0.017153184860944748, 0.015545479021966457, -0.01288109552115202,
-0.020433424040675163, 0.013652535155415535, 0.022170783951878548, 0.024102624505758286,
-0.003623821074143052, 0.03230970352888107, 0.01852751523256302, -0.03132433444261551,
-0.017218012362718582, 0.011279872618615627, 0.052250444889068604, 0.005604281555861235,
0.010722361505031586, 0.006155309733003378, -0.016362298280000687, 0.0038020950742065907,
-0.1179330125451088, 0.0006563718779943883, 0.006923507899045944, 0.010300987400114536,
0.010313952341675758, 0.016128921881318092, 0.010683465749025345, 0.032050397247076035,
-0.0040808506309986115, 0.011318768374621868, -0.021016865968704224, -0.05787741392850876,
-0.018786821514368057, -0.03155771270394325, 0.010994634591042995, -0.00672254478558898,
0.010605673305690289, -0.029042430222034454, -0.018294138833880424, 0.01147435326129198,
-0.0367957204580307, 0.016906842589378357, 0.010385261848568916, -0.006301170215010643,
-0.014547145925462246, -0.004356364719569683, -0.03181701898574829, -0.0031505851075053215,
0.004936564713716507, 0.012952405028045177, 0.010398227721452713, -0.018164483830332756,
0.01794407330453396, -0.020977970212697983, -0.0003221084189135581, 0.007617155089974403,
0.0036173383705317974, -0.01026209071278572, 0.037651438266038895, -0.015662167221307755,
0.011156701482832432, -0.014599007554352283, -0.00417160801589489, -0.03412485495209694,
0.010022231377661228, 0.004084091633558273, -0.005659384187310934, -0.0012973473640158772,
0.02266346849501133, -0.015195414423942566, -0.021211346611380577, -0.01056029461324215,
-0.008284871466457844, -0.001761669758707285, 0.005711245816200972, 0.009205413050949574,
0.008997967466711998, -0.021470654755830765, -0.02379145473241806, 0.013872946612536907,
-0.004845807328820229, -0.016193747520446777, -0.003912300802767277, 0.00304200011305511,
-0.002687073079869151, -0.007617155089974403, 0.003983610309660435, -0.028575677424669266,
0.012939440086483955, -0.004434156697243452, -0.020381562411785126, 0.0030338966753333807,
-0.022131888195872307, 0.0012187449028715491, -0.020251909270882607, 0.003640027716755867,
-0.018449721857905388, 0.008557144552469254, 0.029042430222034454, -0.00807094294577837,
-0.025502884760499, -0.025852948427200317, 0.03057234361767769, -0.032205980271101, 0.011779039166867733,
0.007481019012629986, -0.010112988762557507, 0.017114289104938507, 0.011623455211520195,
-0.01751621626317501, -0.009341550059616566, -0.005238009616732597, -0.013561777770519257,
-0.005001391749829054, -0.014702730812132359, 0.008609006181359291, -0.009568443521857262,
0.00857011042535305, 0.0027049004565924406, 0.009101689793169498, -0.018294138833880424,
-0.0010202126577496529, -0.07286538183689117, 0.01214855257421732, -0.011403043754398823,
0.0032656528055667877, 0.0046221548691391945, -0.017568077892065048, 0.019136887043714523,
-0.022546779364347458, -0.0037826469633728266, -0.004913875367492437, -0.01572699472308159,
0.034306369721889496, 0.013509916141629219, -0.014158184640109539, -0.011014082469046116,
-0.010787188075482845, 0.005027322564274073, 0.002033942611888051, 0.017218012362718582,
0.02478978969156742, -0.010430640541017056, 0.0027437966782599688, 0.00802556425333023,
-0.01406742725521326, 0.0038442325312644243, 0.0035038914065808058, -0.003345065750181675,
0.02386924810707569, -0.00384747376665473, -0.012550478801131248, 0.0008500420954078436,
-0.012025381438434124, 0.015662167221307755, 0.019914809614419937, 0.029846282675862312,
-0.026656802743673325, -0.006203929893672466, -0.011033530347049236, 0.04947585612535477,
0.03575849160552025, -0.007960737682878971, -0.024984268471598625, 0.013548812828958035,
-0.02484164945781231, -0.01406742725521326, 0.011681798845529556, -0.022806087508797646,
0.0018378413515165448, 0.030961304903030396, 0.007928323931992054, 0.03251715004444122,
0.010884428396821022, 0.002808623481541872, -0.008466387167572975, 0.02379145473241806,
-0.019162818789482117, 0.05357291176915169, -0.0044049848802387714, -0.0040581608191132545,
-0.013471020385622978, 0.020718662068247795, 0.00829783733934164, -0.01685498282313347,
-0.004100298509001732, -0.000024132808903232217, -0.014482319355010986, -0.03404706344008446,
0.007876462303102016, 0.0011855211341753602, -0.0405556783080101, -0.01225227490067482,
-0.006268756929785013, 0.015312102623283863, 0.015299137681722641, 0.013548812828958035,
0.014560110867023468, 0.01471569575369358, -0.002121458761394024, -0.01100111659616232,
0.00015933225222397596, 0.01965550146996975, -0.003769681556150317, -0.02826450765132904,
0.01413225382566452, 0.029198015108704567, 0.048568278551101685, -0.005461662542074919,
0.014274872839450836, 0.0036529931239783764, 0.005160217639058828, 0.0010145402047783136,
0.017243942245841026, -0.006793854292482138, 0.0005344163510017097, 0.01341915875673294,
0.019538814201951027, -0.003721061395481229, -0.01056029461324215, 0.005033805035054684,
0.03562884032726288, -0.004832841921597719, 0.012997783720493317, -0.006443789228796959,
-0.015817752107977867, -0.015947405248880386, 0.023208012804389, -0.019590675830841064,
-0.0267605260014534, -0.008045012131333351, 0.0018005658639594913, 0.022935740649700165,
-0.006323859561234713, -0.0033677550964057446, 0.02027783915400505, -0.020433424040675163,
0.01689387857913971, -0.027019832283258438, -0.040166717022657394, -0.0001858707400970161,
0.009393410757184029, 0.0010169713059440255, 0.00733839999884367, 0.00923782680183649,
-0.011007599532604218, 0.022339332848787308, 0.01406742725521326, 0.01659567467868328,
-0.021794788539409637, -0.019085025414824486, -0.008245975710451603, 0.016323402523994446,
-0.0023759042378515005, -0.0077208783477544785, -0.001962633104994893, 0.0036108556669205427,
-0.0008832658641040325, -0.0267605260014534, 0.02873126231133938, -0.022935740649700165,
0.02412855438888073, -0.005513523705303669, 0.00507918419316411, 0.005951105151325464,
0.005173183046281338, 0.005001391749829054, -0.0177625585347414, 0.015130587853491306,
-0.01999260112643242, -0.0192406103014946, 0.02140582725405693, -0.007377295754849911,
0.02927580662071705, -0.015104657039046288, -0.01216800045222044, 0.0034552712459117174,
-0.003481202060356736, 0.013561777770519257, 0.0022592158056795597, -0.01183090079575777,
0.013665501028299332, 0.017075393348932266, 0.008511765860021114, -0.007409709505736828,
-0.0014261907199397683, -0.003058206755667925, 0.0031424816697835922, 0.0021846650633960962,
-0.000989419873803854, -0.022144854068756104, 0.0036270625423640013, -0.006233102176338434,

617
-0.016504917293787003, 0.003999816719442606, 0.017931107431650162, -0.025956671684980392,
0.021626237779855728, 0.002192768268287182, -0.001260882243514061, 0.018721995875239372,
-0.023493250831961632, 0.030598275363445282, -0.011616972275078297, -0.019331367686390877,
0.01505279541015625, 0.01505279541015625, 0.004524914547801018, 0.0019042887724936008,
-0.014962038025259972]

Words that are semantically similar are often represented by vectors that are close to each other in this
vector space. This allows for mathematical operations like addition and subtraction to carry semantic
meaning. For example, the vector representation of "king" minus "man" plus "woman" should be close to
the vector representation of "queen." In other words, vector embeddings are a numerical representation of
a particular data object.

A vector index allows you to retrieve a neighborhood of nodes or relationships based on the similarity
between the embedding properties of those nodes or relationships and the ones specified in the query.

Create vector indexes


A vector index is a single-label, single-property index for nodes or a single-relationship-type, single-
property index for relationships. It can be used to index nodes or relationships by LIST<INTEGER | FLOAT>
properties that are valid for the configured dimensions and vector similarity function of the index. Note that the available vector
index providers (vector-2.0 (default) and vector-1.0) support different index schemas, property value
types, and vector dimensions. For more information, see Vector index providers for compatibility.

A vector index is created by using the CREATE VECTOR INDEX command. It is recommended to give the
index a name when it is created. If no name is given when created, a random name will be assigned. As of
Neo4j 5.16, the index name can also be given as a parameter: CREATE VECTOR INDEX $name ….

The index name must be unique among both indexes and constraints.
 A newly created index is not immediately available but is created in the background.

 Creating indexes requires the CREATE INDEX privilege.

Create vector index for Movie nodes on the embedding property New

CREATE VECTOR INDEX moviePlots IF NOT EXISTS ①


FOR (m:Movie)
ON m.embedding
OPTIONS { indexConfig: {
`vector.dimensions`: 1536,
`vector.similarity_function`: 'cosine'
}} ②

① The CREATE VECTOR INDEX command is optionally idempotent. This means that its default behavior is to
throw an error if an attempt is made to create the same index twice. With IF NOT EXISTS, no error is
thrown and nothing happens should an index with the same name, schema or both already exist. It
may still throw an error should a constraint with the same name exist. As of Neo4j 5.17, an
informational notification is returned when nothing happens, showing the existing index which blocks
the creation.

② Prior to Neo4j 5.23, the OPTIONS map was mandatory since a vector index could not be created without
setting the vector dimensions and similarity function. Since Neo4j 5.23, both can be omitted. To read
more about the available configuration settings, see Configuration settings. In this example, the vector
dimension is explicitly set to 1536 and the vector similarity function to 'cosine', which is generally the
preferred similarity function for text embeddings. To read more about the available similarity functions,
see Cosine and Euclidean similarity functions.

Prior to Neo4j 5.15, node vector indexes were created using the
 db.index.vector.createNodeIndex procedure.

You can also create a vector index for relationships with a particular type on a given property using the
following syntax:

Create a vector index for a relationship type on a single property New

CREATE VECTOR INDEX name IF NOT EXISTS


FOR ()-[r:REL_TYPE]-() ON (r.embedding)
OPTIONS { indexConfig: {
`vector.dimensions`: $dimension,
`vector.similarity_function`: $similarityFunction
}}

Configuration settings

For more information about the values accepted by different index providers, see Vector index providers
for compatibility.

vector.dimensions

The dimensions of the vectors to be indexed. For more information, see Vectors and embeddings in Neo4j.
If this setting is omitted, any LIST<INTEGER | FLOAT> can be indexed and queried; vectors are grouped by
their dimensions, and only vectors of the same dimension can be compared with each other. Setting this value adds
additional checks that ensure only vectors with the configured dimensions are indexed, and that querying the
index with a vector of a different dimension returns an error.

 It is recommended to provide dimensions when creating a vector index.

Accepted values
INTEGER between 1 and 4096 inclusively.

Default value
None. The setting was mandatory prior to Neo4j 5.23.

vector.similarity_function

The name of the similarity function used to assess the similarity of two vectors. To read more about the
available similarity functions, see Cosine and Euclidean similarity functions.

Accepted values
STRING: 'cosine', 'euclidean'.

Default value
'cosine'. The setting was mandatory prior to Neo4j 5.23.

vector.quantization.enabled (new in Neo4j 5.23)

Quantization is a technique to reduce the size of vector representations. Enabling quantization can
accelerate search performance but can slightly decrease accuracy. It is recommended to enable
quantization on machines with limited memory. Vector indexes created prior to Neo4j 5.23 have this
setting effectively set to false.

Accepted values
BOOLEAN: true, false.

Default value
true

Advanced configuration settings

vector.hnsw.m (new in Neo4j 5.23)

The M parameter controls the maximum number of connections each node has in the HNSW (Hierarchical
Navigable Small Worlds) graph. Increasing this value may lead to greater accuracy at the expense of
increased index population and update times, especially for vectors with high dimensionality. Vector
indexes created prior to Neo4j 5.23 have this setting effectively set to 16.

Accepted values
INTEGER between 1 and 512 inclusively.

Default value
16

vector.hnsw.ef_construction (new in Neo4j 5.23)

The number of nearest neighbors tracked during the insertion of vectors into the HNSW graph. Increasing
this value increases the quality of the index, and may lead to greater accuracy (with diminishing returns) at
the expense of increased index population and update times. Vector indexes created prior to Neo4j 5.23
have this setting effectively set to 100.

Accepted values
INTEGER between 1 and 3200 inclusively.

Default value
100
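
Taken together, these settings are all supplied through the same OPTIONS map. The following is a hedged sketch: the index name movieEmbeddings is a placeholder, the dimension follows the earlier Movie example, and the remaining values simply make the documented defaults explicit.

CREATE VECTOR INDEX movieEmbeddings IF NOT EXISTS
FOR (m:Movie) ON m.embedding
OPTIONS { indexConfig: {
  `vector.dimensions`: 1536,
  `vector.similarity_function`: 'cosine',
  `vector.quantization.enabled`: true,
  `vector.hnsw.m`: 16,
  `vector.hnsw.ef_construction`: 100
}}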

Query vector indexes
To query a node vector index, use the db.index.vector.queryNodes procedure.

Signature for db.index.vector.queryNodes

db.index.vector.queryNodes(indexName :: STRING, numberOfNearestNeighbours :: INTEGER, query :: ANY) :: (node :: NODE, score :: FLOAT)

• The indexName refers to the unique name of the vector index to query.

• The numberOfNearestNeighbours refers to the number of nearest neighbors to return.

• The query vector refers to the LIST<INTEGER | FLOAT> in which to search for the neighborhood.

The procedure returns the neighborhood of nodes with their respective similarity scores, ordered by those
scores. The scores are bounded between 0 and 1, where the closer to 1 the score is, the more similar the
indexed vector is to the query vector.

Find the 5 movies with the most similar plot to The Godfather

MATCH (m:Movie {title: 'Godfather, The'})


CALL db.index.vector.queryNodes('moviePlots', 5, m.embedding)
YIELD node AS movie, score
RETURN movie.title AS title, movie.plot AS plot, score

Result

| title | plot | score |
| "Godfather, The" | "The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son." | 1.0 |
| "Godfather: Part III, The" | "In the midst of trying to legitimize his business dealings in New York and Italy in 1979, aging Mafia don Michael Corleone seeks to avow for his sins while taking a young protégé under his wing." | 0.9648237228393555 |
| "Godfather: Part II, The" | "The early life and career of Vito Corleone in 1920s New York is portrayed while his son, Michael, expands and tightens his grip on his crime syndicate stretching from Lake Tahoe, Nevada to pre-revolution 1958 Cuba." | 0.9547788500785828 |
| "Scarface" | "An ambitious and near insanely violent gangster climbs the ladder of success in the mob, but his weaknesses prove to be his downfall." | 0.9367183446884155 |
| "Goodfellas" | "Henry Hill and his friends work their way up through the mob hierarchy." | 0.9300689697265625 |

Note that all movies returned have a plot centred around criminal family organizations. The score results
are returned in descending order, where the best matching result entry is put first (in this case, The
Godfather has a similarity score of 1.0, which is to be expected as the index was queried with this specific
property). If the query vector itself is not wanted, adding the predicate WHERE score < 1 removes identical
vectors.
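
A hedged sketch of that variation, requesting one extra neighbor so that five results remain after the identical vector is filtered out:

MATCH (m:Movie {title: 'Godfather, The'})
CALL db.index.vector.queryNodes('moviePlots', 6, m.embedding)
YIELD node AS movie, score
WHERE score < 1
RETURN movie.title AS title, score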

To query a relationship vector index, use the db.index.vector.queryRelationships procedure.

Signature for db.index.vector.queryRelationships New

db.index.vector.queryRelationships(indexName :: STRING, numberOfNearestNeighbours :: INTEGER, query :: ANY) :: (relationship :: RELATIONSHIP, score :: FLOAT)

db.index.vector.queryRelationships has the same argument descriptions as db.index.vector.queryNodes.
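
For example, a relationship vector index is queried in the same way as a node index. The following is a hedged sketch: the index name reviewEmbeddings, the REVIEWED relationship type, and its reviewId and embedding properties are assumptions made for illustration only.

MATCH ()-[r:REVIEWED {reviewId: 1}]->()
CALL db.index.vector.queryRelationships('reviewEmbeddings', 5, r.embedding)
YIELD relationship, score
RETURN relationship, score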

Use Vector functions to compute the similarity score between two specific vector pairs
 without using a vector index.
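
As a hedged illustration of that alternative, assuming the vector.similarity.cosine and vector.similarity.euclidean functions available in recent Neo4j 5 releases, two vectors can be compared directly:

RETURN
  vector.similarity.cosine([1.0, 0.0, 0.5], [0.9, 0.1, 0.4]) AS cosineScore,
  vector.similarity.euclidean([1.0, 0.0, 0.5], [0.9, 0.1, 0.4]) AS euclideanScore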

Performance suggestions
Vector indexes can take advantage of the incubating Java 20 Vector API for noticeable speed
improvements. If you are using a compatible version of Java, you can add the following setting to your
configuration settings:

Configuration settings

server.jvm.additional=--add-modules jdk.incubator.vector

Show vector indexes


To list all vector indexes in a database, use the SHOW VECTOR INDEXES command. This is the same SHOW
command as for other indexes, with the index type filtering on VECTOR.

 Listing indexes requires the SHOW INDEX privilege.

Example 311. Show all vector indexes

Show vector indexes with the default return columns

SHOW VECTOR INDEXES

Result

| id | name | state | populationPercent | type | entityType | labelsOrTypes | properties | indexProvider | owningConstraint | lastRead | readCount |
| 2 | "moviePlots" | "ONLINE" | 100.0 | "VECTOR" | "NODE" | ["Movie"] | ["embedding"] | "vector-2.0" | NULL | 2024-05-07T09:19:09.225Z | 47 |

For a full description of all return columns, see Search-performance indexes → Result columns for listing
indexes.

Example 312. Show vector indexes with full or filtered details

To return full vector index details, use YIELD *.

Show all vector indexes with all return columns

SHOW VECTOR INDEXES YIELD *

Result

| id | name | state | populationPercent | type | entityType | labelsOrTypes | properties | indexProvider | owningConstraint | lastRead | readCount | trackedSince | options | failureMessage | createStatement |
| 2 | "moviePlots" | "ONLINE" | 100.0 | "VECTOR" | "NODE" | ["Movie"] | ["embedding"] | "vector-2.0" | NULL | 2024-05-07T09:19:09.225Z | 47 | 2024-05-07T08:26:19.072Z | {indexConfig: {`vector.dimensions`: 1536, `vector.hnsw.m`: 16, `vector.quantization.enabled`: TRUE, `vector.similarity_function`: "COSINE", `vector.hnsw.ef_construction`: 100}, indexProvider: "vector-2.0"} | "" | "CREATE VECTOR INDEX `moviePlots` FOR (n:`Movie`) ON (n.`embedding`) OPTIONS {indexConfig: {`vector.dimensions`: 1536,`vector.hnsw.ef_construction`: 100,`vector.hnsw.m`: 16,`vector.quantization.enabled`: true,`vector.similarity_function`: 'COSINE'}, indexProvider: 'vector-2.0'}" |

To return only specific details, specify the desired column name(s) after the YIELD clause.

Show all vector indexes with filtered return columns

SHOW VECTOR INDEXES YIELD name, type, entityType, labelsOrTypes, properties

Result

+----------------------------------------------------------------------+
| name | type | entityType | labelsOrTypes | properties |
+----------------------------------------------------------------------+
| "moviePlots" | "VECTOR" | "NODE" | ["Movie"] | ["embedding"] |
+----------------------------------------------------------------------+

Drop vector indexes
A vector index is dropped by using the same command as for other indexes, DROP INDEX. As of Neo4j 5.16,
the index name can also be given as a parameter when dropping an index: DROP INDEX $name.

 Dropping indexes requires the DROP INDEX privilege.

Drop a vector index

DROP INDEX moviePlots

Vector index providers for compatibility


As of Neo4j 5.18, the default and preferred vector index provider is vector-2.0. Previously created vector-
1.0 indexes will continue to function. New indexes can still be created with the vector-1.0 provider if it is
specified in the OPTIONS map.

Learn more about vector index provider differences

Index schema:
• vector-1.0: Single-label, single-property index for nodes. No relationship support.
• vector-2.0: Single-label, single-property index for nodes. Single-type, single-property index for relationships.

Indexed property value type:
• vector-1.0: LIST<FLOAT>
• vector-2.0: LIST<INTEGER | FLOAT>

Indexed vector dimension:
• vector-1.0: INTEGER between 1 and 2048 inclusive.
• vector-2.0: INTEGER between 1 and 4096 inclusive.

Cosine similarity vector validity:
• vector-1.0: All vector components can be represented finitely in IEEE 754 single precision. The ℓ2-norm is non-zero and can be represented finitely in IEEE 754 single precision.
• vector-2.0: All vector components can be represented finitely in IEEE 754 double precision. The ℓ2-norm is non-zero and can be represented finitely in IEEE 754 double precision. The ratio of each vector component with the ℓ2-norm can be represented finitely in IEEE 754 single precision.

Cosine and Euclidean similarity functions


The choice of similarity function affects which indexed vectors are considered similar, and which are valid.
The semantic meaning of the vector may itself dictate which similarity function to choose. Refer to the
documentation for the particular vector embedding model you are using, as it may suggest a preference
for certain similarity functions. Otherwise, being able to differentiate between the various similarity
functions can assist in making a more informed decision.

Similarity functions

| Name | Case-insensitive argument | Key similarity feature |
| Cosine | "cosine" | angle |
| Euclidean | "euclidean" | distance |

For ℓ2-normalized vectors (unit vectors), cosine and Euclidean similarity functions produce the same
similarity ordering.

Learn more about the cosine similarity function

Cosine similarity is used when the angle between the vectors is what determines how similar two
vectors are.

A valid vector for a cosine vector index is when:

• All vector components can be represented finitely in IEEE 754 double precision.[12]

• Its ℓ2-norm is non-zero and can be represented finitely in IEEE 754 double precision.

• The ratio of each vector component with its ℓ2-norm can be represented finitely in IEEE 754 single precision.

Cosine similarity interprets the vectors in Cartesian coordinates. The measure is related to the angle
between the two vectors. However, an angle can be described in many units, sign conventions, and
periods. The trigonometric cosine of this angle is both agnostic to the aforementioned angle
conventions and bounded. Cosine similarity rebounds the trigonometric cosine.
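
A sketch of this rebounding, assuming the standard mapping of the trigonometric cosine from [-1, 1] onto [0, 1]:

\[
\mathrm{similarity}(\mathbf{a}, \mathbf{b}) \;=\; \frac{1 + \cos\theta}{2}
\;=\; \frac{1}{2}\left(1 + \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert \mathbf{a}\rVert_2\,\lVert \mathbf{b}\rVert_2}\right)
\]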

In the above equation the trigonometric cosine is given by the scalar product of the two unit vectors.

Learn more about the Euclidean similarity function

Euclidean similarity is useful when the distance between the vectors is what determines how similar
two vectors are.

A valid vector for a Euclidean vector index is when all vector components can be represented finitely
in IEEE 754 single precision.

Euclidean similarity interprets the vectors in Cartesian coordinates. The measure is related to the Euclidean
distance, i.e., how far two points are from one another. However, that distance is unbounded and less
useful as a similarity score. Euclidean similarity bounds the square of the Euclidean distance.
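
A sketch of the bounded score, assuming the common mapping of the squared Euclidean distance onto (0, 1]:

\[
\mathrm{similarity}(\mathbf{a}, \mathbf{b}) \;=\; \frac{1}{1 + \lVert \mathbf{a} - \mathbf{b}\rVert_2^{\,2}}
\]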

Vector index procedures

• Create node vector index: db.index.vector.createNodeIndex. Creates a vector index for the specified label and property, with the given vector dimension, using the given similarity function. Replaced by the CREATE VECTOR INDEX command.

• Use node vector index: db.index.vector.queryNodes. Queries the given node vector index. Returns the requested number of approximate nearest neighbor nodes and their similarity score, ordered by score.

• Use relationship vector index: db.index.vector.queryRelationships. Queries the given relationship vector index. Returns the requested number of approximate nearest neighbor relationships and their similarity score, ordered by score. New

• Set node vector property: db.create.setNodeVectorProperty. Updates a given node property with the given vector in a more space-efficient way than directly using SET. Replaces db.create.setVectorProperty. Beta New

• Set node vector property: db.create.setVectorProperty. Replaced by db.create.setNodeVectorProperty. Deprecated Beta

• Set relationship vector property: db.create.setRelationshipVectorProperty. Updates a given relationship property with the given vector in a more space-efficient way than directly using SET. Beta New
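
A hedged sketch of how the setter procedures are called; the $embedding parameter stands in for a LIST<FLOAT> produced by an external embedding model, and the Movie label and embedding property follow the earlier examples:

MATCH (m:Movie {title: 'Godfather, The'})
CALL db.create.setNodeVectorProperty(m, 'embedding', $embedding)
RETURN m.title AS title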

Limitations and known issues


As of Neo4j 5.13, the vector index is no longer a beta feature. It does, however, still contain some
limitations and known issues.

Limitations

• The query is an approximate nearest neighbor search. The requested k nearest neighbors may
not be the exact k nearest, but close within the same wider neighborhood.

• For large requested nearest neighbors, k, close to the total number of indexed vectors, the search
may retrieve fewer than k results.

• Only one vector index can exist over a given schema. For example, you cannot have both a Euclidean
and a cosine vector index on the same label-property key pair.

• Changes made within the same transaction are not visible to the index.

Known issues

The following lists the known issues and, if fixed, the version in which they were fixed:

• The creation of a vector index using the legacy procedure db.index.vector.createNodeIndex may fail with an error in Neo4j 5.18 and later if the database was last written to with a version prior to Neo4j 5.11, and the legacy procedure is the first write operation used on the newer version. In Neo4j 5.20, the error was clarified.

   Using the CREATE VECTOR INDEX command instead avoids this issue. If the use of the procedure is unavoidable, performing any other write operation to the database on the newer binary before using the procedure avoids the issue.

• Procedure signatures from SHOW PROCEDURES render the vector arguments with a type of ANY rather than the semantically correct type of LIST<INTEGER | FLOAT>.

   The types are still enforced as LIST<INTEGER | FLOAT>.

• No provided settings or options for tuning the index. Fixed in Neo4j 5.23.

• Only node vector indexes are supported. Fixed in Neo4j 5.18.

• Vector indexes cannot be assigned autogenerated names. Fixed in Neo4j 5.15.

• There is no Cypher syntax for creating a vector index. Fixed in Neo4j 5.15.

   Use the procedure db.index.vector.createNodeIndex to create a vector index. Procedure signature:

  db.index.vector.createNodeIndex(indexName :: STRING, label :: STRING, propertyKey :: STRING, vectorDimension :: INTEGER, vectorSimilarityFunction :: STRING)

• The standard index type filtering for the SHOW INDEXES command is missing. Fixed in Neo4j 5.15.

   Filtering on vector indexes can be done with the WHERE clause as well:

  SHOW INDEXES
  WHERE type = 'VECTOR'

• Vector indexes may incorrectly reject valid queries in a cluster setting. This is caused by an issue in the handling of index capabilities on followers. Fixed in Neo4j 5.14.

   Because index capabilities will be correctly configured on a restart, this issue can be worked around by rolling the cluster after vector index creation. For more information about clustering in Neo4j, see the Operations Manual → Clustering.

• Querying for a single approximate nearest neighbor from an index would fail a validation check. Passing a null value would also provide an unhelpful exception. Fixed in Neo4j 5.13.

• Vector index queries throw an exception if the transaction state contains changes. This means that writes may only take place after the last vector index query in a transaction. Fixed in Neo4j 5.13.

   To work around this issue if you need to run multiple vector index queries and make changes based on the results, you can run the queries in a CALL { ... } IN TRANSACTIONS clause to isolate them from the outer transaction’s state.

• SHOW PROCEDURES does not show the vector index procedures db.create.setVectorProperty, db.index.vector.createNodeIndex, and db.index.vector.queryNodes. Fixed in Neo4j 5.12.

   The procedures are still usable, just not visible.

• Passing null as an argument to some of the procedure parameters can generate a confusing exception. Fixed in Neo4j 5.12.

• The creation of the vector index skipped the check to limit the dimension to 2048. Fixed in Neo4j 5.12.

   Vector indexes configured with a dimension greater than 2048 in Neo4j 5.11 should continue to work after the limitation is applied.

• The validation for cosine similarity verifies that the vector's ℓ2-norm can be represented finitely in IEEE 754 double precision, rather than in single precision. This can lead to certain large-component vectors being incorrectly indexed and returning a similarity score of ±0.0. Fixed in Neo4j 5.12.

• Query vector validation for a cosine vector index only considers the last component of the vector when validating the ℓ2-norm. If that component is ±0.0, an otherwise valid query vector is rejected as invalid. This can also result in some invalid vectors being used to query, returning a similarity score of ±0.0. Fixed in Neo4j 5.12.

   For ℓ2-normalized vectors (unit vectors), which thus have unit length, Euclidean and cosine similarity functions produce the same similarity ordering. It is recommended to normalize your vectors (if needed) and use a Euclidean vector index.

• The vector index createStatement field from SHOW INDEXES does not correctly escape single quotes in index names, labels, and property keys. Fixed in Neo4j 5.12.

• Copying a database store with a vector index does not log the recreation command. Instead, it logs the error ERROR: [StoreCopy] Unable to format statement for index 'index-name', due to a java.lang.IllegalArgumentException: Did not recognize index type VECTOR. Fixed in Neo4j 5.12.

   If a store copy is required, make a note of the information in the createStatement column returned from the SHOW INDEX command. For example:

  SHOW INDEXES YIELD type, createStatement
  WHERE type = 'VECTOR'
  RETURN createStatement

• Some of the protections preventing the use of new features during a database rolling upgrade are missing. This can result in a transaction that creates a vector index on a cluster member running Neo4j 5.11 being distributed to other cluster members running an older Neo4j version. The older Neo4j versions will fail to understand the transaction. Fixed in Neo4j 5.12.

   Ensure that all cluster members have been updated to Neo4j 5.11 (or a newer version) before calling dbms.upgrade() on the system database. Once committed, vector indexes can be safely created on the cluster.

Syntax
This page contains the syntax for creating, listing, and dropping the indexes available in Neo4j. It also
contains the signatures for the procedures necessary to call in order to use full-text and vector indexes.

More details about the syntax can be found in the Operations Manual → Cypher syntax for administration
commands.

CREATE INDEX
The general structure of the CREATE INDEX command is:

CREATE [index_type] INDEX [index_name] [IF NOT EXISTS]


FOR {node_pattern | relationship_pattern}
ON property_or_token_lookup_pattern
[OPTIONS “{“ option: value[, …] “}”]

The CREATE … INDEX … command is optionally idempotent. This means that its default behavior is to throw
an error if an attempt is made to create an index with the same name twice. With IF NOT EXISTS, no error
is thrown and nothing happens should an index with the same name or same schema and index type
already exist (it may still throw an error if conflicting constraints exist, such as constraints with the same
name or with the same schema and backing index). As of Neo4j 5.17, an informational notification is
instead returned showing the existing index which blocks the creation.

The index name must be unique among both indexes and constraints. A random name will be assigned if
no name is explicitly given when an index is created.

Index providers and configuration settings can be specified using the OPTIONS clause. However, not all
indexes have available configuration settings or multiple providers. In those cases, nothing needs to be
specified and the OPTIONS map should be omitted from the query.

 Creating an index requires the CREATE INDEX privilege.

Range indexes
Range indexes have only one index provider, range-1.0, and no supported index configuration. Since the
index provider will be assigned by default, the OPTIONS map has been omitted from the syntax below.

Create a range index for a node label, either on a single property or composite

CREATE [RANGE] INDEX [index_name] [IF NOT EXISTS]


FOR (n:LabelName)
ON (n.propertyName_1[,
n.propertyName_2,
...
n.propertyName_n])

Create a range index for a relationship type, either on a single property or composite

CREATE [RANGE] INDEX [index_name] [IF NOT EXISTS]


FOR ()-”[“r:TYPE_NAME”]”-()
ON (r.propertyName_1[,
r.propertyName_2,
...
r.propertyName_n])

For more information, see Create, show, and delete indexes → Create a range index.

Text indexes
Create a text index for a node label on a single property

CREATE TEXT INDEX [index_name] [IF NOT EXISTS]


FOR (n:LabelName)
ON (n.propertyName_1)
[OPTIONS “{“ option: value[, …] “}”]

Create a text index for a relationship type on a single property

CREATE TEXT INDEX [index_name] [IF NOT EXISTS]


FOR ()-”[“r:TYPE_NAME”]”-()
ON (r.propertyName_1)
[OPTIONS “{“ option: value[, …] “}”]

Text indexes have no supported index configuration and, as of Neo4j 5.1, they have two index providers
available, text-2.0 (default) and text-1.0 (deprecated).

 It is not possible to create composite text indexes on multiple properties.

For more information, see Create, show, and delete indexes → Create a text index.

Point indexes
Create a point index for a node label on a single property

CREATE POINT INDEX [index_name] [IF NOT EXISTS]


FOR (n:LabelName)
ON (n.propertyName_1)
[OPTIONS “{“ option: value[, …] “}”]

Create a point index for a relationship type on a single property

CREATE POINT INDEX [index_name] [IF NOT EXISTS]


FOR ()-”[“r:TYPE_NAME”]”-()
ON (r.propertyName_1)
[OPTIONS “{“ option: value[, …] “}”]

Point indexes have only one index provider available, point-1.0. The following settings can be specified
for point indexes:

• spatial.cartesian.min

• spatial.cartesian.max

• spatial.cartesian-3d.min

• spatial.cartesian-3d.max

• spatial.wgs-84.min

• spatial.wgs-84.max

• spatial.wgs-84-3d.min

• spatial.wgs-84-3d.max

 It is not possible to create composite point indexes on multiple properties.

For more information, see Create, show, and delete indexes → Create a point index.

Token lookup indexes


Token lookup indexes have only one index provider, token-lookup-1.0, and no supported index
configuration. Since the index provider will be assigned by default, the OPTIONS map has been omitted from
the syntax below.

Create a node label lookup index

CREATE LOOKUP INDEX [index name] [IF NOT EXISTS]


FOR (n)
ON EACH labels(n)

Create a relationship type lookup index

CREATE LOOKUP INDEX [index name] [IF NOT EXISTS]


FOR ()-”[“r”]”-()
ON [EACH] type(r)

Two token lookup indexes are present by default when creating a Neo4j database, and only one node label
lookup index and one relationship type lookup index can exist at the same time.

For more information, see Create, show, and delete indexes → Create a token lookup index.

Full-text indexes

Create a full-text index for one or more node labels, either on a single property or multiple properties

CREATE FULLTEXT INDEX [index_name] [IF NOT EXISTS]


FOR (n:LabelName[“|” …])
ON EACH “[“ n.propertyName[, ...] “]”
[OPTIONS “{“ option: value[, …] “}”]

Create a full-text index for one or more relationship types, either on a single property or multiple properties

CREATE FULLTEXT INDEX [index_name] [IF NOT EXISTS]


FOR ()-”[“r:TYPE_NAME[“|” ...]”]”-()
ON EACH “[“ r.propertyName[, ...] “]”
[OPTIONS “{“ option: value[, …] “}”]

Full-text indexes have only one index provider available, fulltext-1.0. The following settings can be
specified for full-text indexes:

• fulltext.analyzer - specifies what analyzer to use (the db.index.fulltext.listAvailableAnalyzers


procedure lists what analyzers are available).

• fulltext.eventually_consistent - specifies whether a full-text index is eventually consistent. If set to


true, it will ensure that updates from committing transactions are applied in a background thread.

For more information, see Full-text indexes - Create full-text indexes.

Vector indexes
Create a vector index for a node label on a single property New

CREATE VECTOR INDEX [index_name] [IF NOT EXISTS]


FOR (n:LabelName)
ON (n.propertyName)
[OPTIONS “{“ option: value[, …] “}”]

Create a vector index for a relationship type on a single property New

CREATE VECTOR INDEX [index_name] [IF NOT EXISTS]


FOR ()-”[“r:TYPE_NAME”]”-()
ON (r.propertyName)
[OPTIONS “{“ option: value[, …] “}”]

As of Neo4j 5.18, vector indexes have two vector index providers available, vector-2.0 (default) and
vector-1.0. For more information, see Vector index providers for compatibility.

For a full list of all vector index settings, see Vector index configuration settings. Note that the OPTIONS
clause was mandatory prior to Neo4j 5.23 because it was necessary to configure the vector.dimensions
and vector.similarity_function settings when creating a vector index.

OPTIONS {
indexConfig: {
`vector.dimensions`: $dimension,
`vector.similarity_function`: $similarityFunction
}
}

 It is not possible to create composite vector indexes on multiple properties.

For more information, see Vector indexes - Create and configure vector indexes.

SHOW INDEX

 Listing indexes requires the SHOW INDEX privilege.

List indexes in the database (either all or filtered on index type)

SHOW [ALL | FULLTEXT | LOOKUP | POINT | RANGE | TEXT | VECTOR] INDEX[ES]


[YIELD { * | field[,...] } [ORDER BY field[,...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

When using the RETURN clause, the YIELD clause is mandatory.

For more information, see Create, show, and delete indexes → SHOW INDEXES.

Query semantic indexes


Neo4j’s semantic indexes are not used automatically in Cypher queries. To use them, specific procedures
must be called. Their signatures can be seen below.

Full-text indexes
Query full-text index on nodes: db.index.fulltext.queryNodes

CALL db.index.fulltext.queryNodes(indexName :: STRING, queryString :: STRING, options = {} :: MAP)

Query full-text index on relationships: db.index.fulltext.queryRelationships

CALL db.index.fulltext.queryRelationships(indexName :: STRING, queryString :: STRING, options = {} :: MAP)

The valid key: value pairs for the options map are:

• skip: <number> — skip the top N results.

• limit: <number> — limit the number of results returned.

• analyzer: <string> — use the specified analyzer as a search analyzer for this query.

The options map and all of the keys are optional.
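
As a hedged sketch of the options map in use (the index name movieTitles and the choice of the english analyzer are illustrative assumptions):

CALL db.index.fulltext.queryNodes('movieTitles', 'matrix', {limit: 10, analyzer: 'english'})
YIELD node, score
RETURN node, score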

For more information, see Full-text indexes - Query full-text indexes.

Vector indexes
Query vector index on nodes: db.index.vector.queryNodes New

CALL db.index.vector.queryNodes(indexName :: STRING, numberOfNearestNeighbours :: INTEGER, query :: LIST<INTEGER | FLOAT>)

Query vector index on relationships: db.index.vector.queryRelationships New

CALL db.index.vector.queryRelationships(indexName :: STRING, numberOfNearestNeighbours :: INTEGER, query :: LIST<INTEGER | FLOAT>)

The numberOfNearestNeighbours refers to the number of nearest neighbors to return as the neighborhood.
The query vector refers to the LIST<INTEGER | FLOAT> in which to search for the neighborhood.

For more information, see Vector indexes - Query vector indexes.

DROP INDEX
The DROP INDEX command can drop indexes of all types using their name. The name of the index can be
found using the SHOW INDEXES command, given in the output column name.

The DROP INDEX command is optionally idempotent. This means that its default behavior is to throw an
error if an attempt is made to drop the same index twice. With IF EXISTS, no error is thrown and nothing
happens should the index not exist. As of Neo4j 5.17, an informational notification is instead returned
detailing that the index does not exist.

 Dropping indexes requires the DROP INDEX privilege.

Drop an index of any index type

DROP INDEX index_name [IF EXISTS]

For more information, see Create, show, and delete indexes → DROP INDEX.


[8] Index providers are essentially different implementations of the same index type. Different providers are only available for
text indexes.

[9] The example queries on this page are prepended with PROFILE. This both runs the query and generates its execution
plan. For more information, see Execution plans and query tuning → Note on PROFILE and EXPLAIN.

[10] The trackedSince column is not part of the default return columns for the SHOW INDEXES command. To return this and all
other non-default columns, use SHOW INDEXES YIELD *. For more information, see Create, show, and delete indexes → Result
columns for listing indexes.

[11] Lucene implements a Hierarchical Navigable Small World (HNSW) Graph to perform a k approximate nearest neighbors
(k-ANN) query over the vector fields. For more information, see Efficient and Robust Approximate Nearest Neighbor Search
Using Hierarchical Navigable Small World Graphs — Yury A. Malkov and Dmitry A. Yashunin

[12] IEEE Standard for Floating-Point Arithmetic

Constraints
Neo4j offers several constraints to ensure the quality and integrity of data in a graph. The following
constraints are available in Neo4j:

• Property uniqueness constraints ensure that the combined property values are unique for all nodes
with a specific label or all relationships with a specific type.

• Property existence constraints ensure that a property exists either for all nodes with a specific label or
for all relationships with a specific type. Enterprise edition

• Property type constraints ensure that a property has the required property type for all nodes with a
specific label or for all relationships with a specific type. New Enterprise edition

• Key constraints ensure that all properties exist and that the combined property values are unique for
all nodes with a specific label or all relationships with a specific type. Enterprise edition

To learn more about creating, listing, and dropping these constraints, as well as information about index-
backed constraints, constraint creation failures and data violation scenarios, and more, see Create, show,
and drop constraints.

For reference material about the Cypher commands used to manage constraints, see Syntax.


Create, show, and drop constraints


This page describes how to create, list, and drop constraints. The following constraint types are available
in Neo4j:

• Property uniqueness constraints

• Property existence constraints Enterprise edition

• Property type constraints New Enterprise edition

• Key constraints Enterprise edition

CREATE CONSTRAINT
Constraints are created with the CREATE CONSTRAINT command. When creating a constraint, it is
recommended to provide a constraint name. This name must be unique among both indexes and
constraints. If a name is not explicitly given, a unique name will be auto-generated.

 Creating a constraint requires the CREATE CONSTRAINT privilege.

Adding constraints is an atomic operation that can take a while — all existing data has to
 be scanned before a Neo4j DBMS can use a constraint.

Create property uniqueness constraints
Property uniqueness constraints ensure that the property values are unique for all nodes with a specific
label or all relationships with a specific type. For composite property uniqueness constraints on multiple
properties, it is the combination of property values that must be unique. Queries that try to add duplicated
property values will fail.

Property uniqueness constraints do not require all nodes or relationships to have values for the properties
listed in the constraint. Only nodes or relationships that contain all properties specified in the constraint are
subject to the uniqueness rule. Nodes or relationships missing one or more of the specified properties are
not subject to this rule.
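
As a hedged sketch of this rule, assuming the book_isbn constraint from Example 313 below is in place, the following query succeeds because the node omits the constrained isbn property entirely:

CREATE (:Book {title: 'Untitled Manuscript'})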

• Create a single property uniqueness constraint

• Create a composite property uniqueness constraint

• Create data that complies with existing property uniqueness constraints

Create a single property uniqueness constraint

Single property uniqueness constraints are created with the following commands:

• Node property uniqueness constraints: CREATE CONSTRAINT constraint_name FOR (n:Label) REQUIRE
n.property IS UNIQUE.

• Relationship property uniqueness constraints: CREATE CONSTRAINT constraint_name FOR ()-[r:REL_TYPE]-() REQUIRE r.property IS UNIQUE. New

For the full command syntax to create a property uniqueness constraint, see Syntax → Create property
uniqueness constraints.

Example 313. Create a node property uniqueness constraint on a single property

Create a constraint requiring Book nodes to have unique isbn properties

CREATE CONSTRAINT book_isbn


FOR (book:Book) REQUIRE book.isbn IS UNIQUE

Result

Added 1 constraint.

The detailed statistics view currently says Unique constraints added: 1. It will be

 updated to say Node property uniqueness constraints added: 1 in a future


version of Neo4j.

Example 314. Create a relationship property uniqueness constraint on a single property New

Create a constraint requiring SEQUEL_OF relationships to have unique order properties

CREATE CONSTRAINT sequels


FOR ()-[sequel:SEQUEL_OF]-() REQUIRE sequel.order IS UNIQUE

Result

Added 1 constraint.

The detailed statistics view currently says Relationship uniqueness constraints

 added: 1. It will be updated to say Relationship property uniqueness


constraints added: 1 in a future version of Neo4j.

Create a composite property uniqueness constraint

Constraints created for multiple properties are called composite constraints. Note that the constrained
properties must be parenthesized when creating composite property uniqueness constraints.

• Node property uniqueness constraints: CREATE CONSTRAINT constraint_name FOR (n:Label) REQUIRE
(n.propertyName_1, …, n.propertyName_n) IS UNIQUE.

• Relationship property uniqueness constraints: CREATE CONSTRAINT constraint_name FOR ()-[r:REL_TYPE]-() REQUIRE (r.propertyName_1, …, r.propertyName_n) IS UNIQUE. New

For the full command syntax to create a property uniqueness constraint, see Syntax → Create property
uniqueness constraints.

Example 315. Create a composite node property uniqueness constraint on several properties

Create a constraint requiring Book nodes to have unique combinations of title and publicationYear
properties

CREATE CONSTRAINT book_title_year
FOR (book:Book) REQUIRE (book.title, book.publicationYear) IS UNIQUE

Result

Added 1 constraint.

Example 316. Create a composite relationship property uniqueness constraint on several properties New

Create a constraint requiring PREQUEL_OF relationships to have unique combinations of order and
author properties

CREATE CONSTRAINT prequels
FOR ()-[prequel:PREQUEL_OF]-() REQUIRE (prequel.order, prequel.author) IS UNIQUE

Result

Added 1 constraint.

Create data that complies with existing property uniqueness constraints

Example 317. Create a node that complies with existing property uniqueness constraints

Create a Book node with a unique isbn property

CREATE (book:Book {isbn: '1449356265', title: 'Graph Databases'})

Result

Added 1 label, created 1 node, set 2 properties

Example 318. Create a relationship that complies with existing property uniqueness constraints

Create a SEQUEL_OF relationship with a unique order property

CREATE (:Book {title: 'Spirit Walker'})-[:SEQUEL_OF {order: 1, seriesTitle: 'Chronicles of Ancient Darkness'}]->(:Book {title: 'Wolf Brother'})

Result

Added 2 labels, created 2 nodes, set 4 properties, created 1 relationship.

Create property existence constraints Enterprise Edition


Property existence constraints ensure that a property exists either for all nodes with a specific label or for
all relationships with a specific type. Queries that try to create new nodes of the specified label, or
relationships of the specified type, without the constrained property will fail. The same is true for queries
that try to remove the mandatory property.

• Create a single property existence constraint

• Create data that complies with existing property existence constraints

Create a single property existence constraint

Property existence constraints on single properties are created with the following commands:

• Node property existence constraint: CREATE CONSTRAINT constraint_name FOR (n:Label) REQUIRE
n.property IS NOT NULL.

• Relationship property existence constraint: CREATE CONSTRAINT constraint_name FOR ()-[r:REL_TYPE]-() REQUIRE r.property IS NOT NULL.

For the full command syntax to create an existence constraint, see Syntax → Create property existence
constraints.

 It is not possible to create composite existence constraints on several properties.

Example 319. Create a node property existence constraint

Create a constraint requiring Author nodes to have a name property

CREATE CONSTRAINT author_name
FOR (author:Author) REQUIRE author.name IS NOT NULL

Result

Added 1 constraint.

Example 320. Create a relationship property existence constraint

Create a constraint requiring WROTE relationships to have a year property

CREATE CONSTRAINT wrote_year
FOR ()-[wrote:WROTE]-() REQUIRE wrote.year IS NOT NULL

Result

Added 1 constraint.

Create data that complies with existing property existence constraints

Example 321. Create a node that complies with existing node property existence constraints

Create an Author node with a name property:

CREATE (author:Author {name:'Virginia Woolf', surname: 'Woolf'})

Result

Added 1 label, created 1 node, set 2 properties

Example 322. Create a relationship that complies with existing relationship property existence constraints

Create a WROTE relationship with a year property

CREATE (author:Author {name: 'Emily Brontë', surname: 'Brontë'})-[wrote:WROTE {year: 1847, location:
'Haworth, United Kingdom', published: true}]->(book:Book {title:'Wuthering Heights', isbn:
9789186579296})

Result

Added 2 labels, created 2 nodes, set 7 properties, created 1 relationship

Create property type constraints Enterprise Edition, new in 5.9


Property type constraints ensure that a property has the required data type for all nodes with a specific
label or for all relationships with a specific type. Queries that attempt to add this property with the wrong
data type or modify this property in a way that changes its data type for nodes of the specified label or
relationships of the specified type will fail.

Property type constraints do not require all nodes or relationships to have the property. Nodes or
relationships without the constrained property are not subject to this rule.
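
For illustration, assuming a property type constraint such as the movie_title constraint created in Example 323 below, a Movie node created without a title property is still accepted. A minimal sketch (the released property is illustrative):

// Allowed: the node has no title property, so the type rule does not apply to it.
CREATE (:Movie {released: 2010})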

• Create a single property type constraint

• Create property type constraints with a union type New in 5.11

• Allowed types

• Creating property type constraints on invalid types will fail

• Create data that complies with existing property type constraints

Create a single property type constraint

Property type constraints are created with the following commands:

• Node property type constraints: CREATE CONSTRAINT constraint_name FOR (n:Label) REQUIRE
n.property IS :: <TYPE>.

• Relationship property type constraints: CREATE CONSTRAINT constraint_name FOR ()-[r:REL_TYPE]-() REQUIRE r.property IS :: <TYPE>.

<TYPE> refers to a specific Cypher data type, such as STRING or INTEGER. For the types that properties can
be constrained by, see Allowed types, and for information about different data types in Cypher, see Values
and types. For the full command syntax to create a property type constraint, see Syntax → Create property
type constraints.

 It is not possible to create composite property type constraints on several properties.

Example 323. Create a node property type constraint

Create a constraint requiring title properties on Movie nodes to be of type STRING

CREATE CONSTRAINT movie_title
FOR (movie:Movie) REQUIRE movie.title IS :: STRING

Result

Added 1 constraint.

Example 324. Create a relationship property type constraint

Create a constraint requiring order properties on PART_OF relationships to be of type INTEGER

CREATE CONSTRAINT part_of
FOR ()-[part:PART_OF]-() REQUIRE part.order IS :: INTEGER

Result

Added 1 constraint.

Create property type constraints with a union type New in 5.11

A closed dynamic union allows a node or relationship property to maintain some type flexibility whilst
preventing unexpected values from being stored.

Example 325. Create a node property type constraint with a union type

Create a constraint requiring tagline properties on Movie nodes to be either of type STRING or
LIST<STRING NOT NULL>

CREATE CONSTRAINT movie_tagline
FOR (movie:Movie) REQUIRE movie.tagline IS :: STRING | LIST<STRING NOT NULL>

Result

Added 1 constraint.

Example 326. Create a relationship property type constraint with a union type

Create a constraint requiring tags properties on PART_OF relationships to be either of type STRING or
LIST<STRING NOT NULL>

CREATE CONSTRAINT part_of_tags
FOR ()-[part:PART_OF]-() REQUIRE part.tags IS :: STRING | LIST<STRING NOT NULL>

Result

Added 1 constraint.

Allowed types

The allowed property types for property type constraints are:

• BOOLEAN

• STRING

• INTEGER

• FLOAT

• DATE

• LOCAL TIME

• ZONED TIME

• LOCAL DATETIME

• ZONED DATETIME

• DURATION

• POINT

• LIST<BOOLEAN NOT NULL> New

• LIST<STRING NOT NULL> New

• LIST<INTEGER NOT NULL> New

• LIST<FLOAT NOT NULL> New

• LIST<DATE NOT NULL> New

• LIST<LOCAL TIME NOT NULL> New

• LIST<ZONED TIME NOT NULL> New

• LIST<LOCAL DATETIME NOT NULL> New

• LIST<ZONED DATETIME NOT NULL> New

• LIST<DURATION NOT NULL> New

• LIST<POINT NOT NULL> New

• Any closed dynamic union of the above types, e.g. INTEGER | FLOAT | STRING. New

For a complete reference describing all types available in Cypher, see the section on types and their
synonyms.

Creating property type constraints on invalid types will fail

Example 327. Create a node property type constraint with an invalid type

Create a constraint requiring imdbScore properties on Movie nodes to be of type MAP

CREATE CONSTRAINT score FOR (movie:Movie) REQUIRE movie.imdbScore IS :: MAP

Error message

Failed to create node property type constraint: Invalid property type `MAP`.

Create data that complies with existing property type constraints

Example 328. Create a node that complies with existing node property type constraint

Create a Movie node with a STRING title property

CREATE (movie:Movie {title:'Iron Man'})

Result

Added 1 label, created 1 node, set 1 properties

Example 329. Create a relationship that complies with existing relationship property type constraint

Create a PART_OF relationship with an INTEGER order property

MATCH (movie:Movie {title:'Iron Man'})
CREATE (movie)-[part:PART_OF {order: 3}]->(franchise:Franchise {name:'MCU'})

Result

Added 1 label, added 1 node, created 1 relationship, set 2 properties

Create key constraints Enterprise Edition


Key constraints ensure that the property exists and that the property value is unique for all nodes with a specific label or all relationships with a specific type. For composite key constraints on multiple properties, all properties must exist and the combination of property values must be unique.

Queries that try to create new nodes of the specified label, or relationships of the specified type, without the constrained property will fail. The same is true for queries that try to remove the mandatory property or add duplicated property values.

• Create a single property key constraint

• Create a composite key constraint

• Create data that complies with existing key constraints

Create a single property key constraint

Single property key constraints are created with the following commands:

• Node key constraints: CREATE CONSTRAINT constraint_name FOR (n:Label) REQUIRE n.property IS
NODE KEY.

• Relationship key constraints: CREATE CONSTRAINT constraint_name FOR ()-[r:REL_TYPE]-() REQUIRE r.property IS RELATIONSHIP KEY. New

For the full command syntax to create a key constraint, see Syntax → Create key constraints.

Example 330. Create a node key constraint on a single property

Create a constraint requiring Director nodes to have a unique imdbId property as a node key.

CREATE CONSTRAINT director_imdbId
FOR (director:Director) REQUIRE (director.imdbId) IS NODE KEY

Result

Added 1 constraint.

Example 331. Create a relationship key constraint on a single property New

Create a constraint requiring OWNS relationships to have a unique ownershipId property as a relationship key

CREATE CONSTRAINT ownershipId
FOR ()-[owns:OWNS]-() REQUIRE owns.ownershipId IS RELATIONSHIP KEY

Result

Added 1 constraint.

Create a composite key constraint

Constraints created for multiple properties are called composite constraints. Note that the constrained
properties must be parenthesized when creating composite key constraints.

Composite key constraints are created with the following commands:

• Node key constraints: CREATE CONSTRAINT constraint_name FOR (n:Label) REQUIRE (n.propertyName_1, …, n.propertyName_n) IS NODE KEY.

• Relationship key constraints: CREATE CONSTRAINT constraint_name FOR ()-[r:REL_TYPE]-() REQUIRE (r.propertyName_1, …, r.propertyName_n) IS RELATIONSHIP KEY. New

For the full command syntax to create a key constraint, see Syntax → Create key constraints.

Example 332. Create a composite node key constraint on multiple properties

Create a constraint requiring Actor nodes to have a unique combination of firstname and surname
properties as a node key

CREATE CONSTRAINT actor_fullname
FOR (actor:Actor) REQUIRE (actor.firstname, actor.surname) IS NODE KEY

Result

Added 1 constraint.

Example 333. Create a composite relationship key constraint on multiple properties New

Create a constraint requiring KNOWS relationships to have a unique combination of since and how
properties as a relationship key

CREATE CONSTRAINT knows_since_how
FOR ()-[knows:KNOWS]-() REQUIRE (knows.since, knows.how) IS RELATIONSHIP KEY

Result

Added 1 constraint.

Create data that complies with existing key constraints

Example 334. Create a node that complies with existing node key constraints

Create an Actor node with unique firstname and surname properties

CREATE (actor:Actor {firstname: 'Keanu', surname: 'Reeves'})

Result

Added 1 label, created 1 node, set 2 properties.

Example 335. Create a relationship that complies with existing relationship key constraints

Create a KNOWS relationship with unique since and how properties

CREATE (:Actor {firstname: 'Jensen', surname: 'Ackles'})-[:KNOWS {since: 2008, how: 'coworkers',
friend: true}]->(:Actor {firstname: 'Misha', surname: 'Collins'})

Result

Added 2 labels, created 2 nodes, set 7 properties, created 1 relationship.

Create a constraint with a parameter New in 5.16


All constraint types can be created with a parameterized name.

Example 336. Create a node property uniqueness constraint using a parameter

Parameters

{
"name": "node_uniqueness_param"
}

Create a node property uniqueness constraint with a parameterized name

CREATE CONSTRAINT $name
FOR (book:Book) REQUIRE book.prop1 IS UNIQUE

Result

Added 1 constraint.

Example 337. Create a relationship property existence constraint using a parameter

Parameters

{
"name": "rel_exist_param"
}

Create a relationship property existence constraint with a parameterized name

CREATE CONSTRAINT $name
FOR ()-[wrote:WROTE]-() REQUIRE wrote.published IS NOT NULL

Result

Added 1 constraint.

Handling multiple constraints
Creating an already existing constraint will fail. This includes the following scenarios:

• Creating a constraint identical to an already existing constraint.

• Creating a constraint with a different name but on the same constraint type and same
label/relationship type and property combination as an already existing constraint. For property type
constraints the property type also needs to be the same.

• Creating a constraint with the same name as an already existing constraint, regardless of what that
constraint is.

Additionally, some constraints cannot coexist and attempting to create them together will therefore fail as
well. This includes:

• Property type constraints on the same label/relationship type and property but with different property
types.

• Property uniqueness and key constraints on the same label/relationship type and property
combination.

However, some constraint types are allowed on the same label/relationship type and property
combination. For example, it is possible to have a property uniqueness and a property existence constraint
on the same label/relationship type and property combination, though this would be the equivalent of
having a node or relationship key constraint. A more useful example would be to combine a property type
and a property existence constraint to ensure that the property exists and has the given type.
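
A minimal sketch of that combination, run as two separate commands (the constraint names and the :Book(price) schema are illustrative and not part of the running examples):

// Require every Book node to have a price property ...
CREATE CONSTRAINT book_price_exists
FOR (book:Book) REQUIRE book.price IS NOT NULL

// ... and require that price is always a FLOAT.
// Together, the two constraints guarantee an always-present FLOAT property.
CREATE CONSTRAINT book_price_type
FOR (book:Book) REQUIRE book.price IS :: FLOAT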

• Handling existing constraints when creating a constraint

• Creating an already existing constraint will fail

Handling existing constraints when creating a constraint

To avoid failing on existing constraints, IF NOT EXISTS can be added to the CREATE command. This will
ensure that no error is thrown and that no constraint is created if any other constraint with the given name,
or another constraint on the same constraint type and schema, or both, already exists. For property type
constraints the property type also needs to be the same. As of Neo4j 5.17, an informational notification is
instead returned showing the existing constraint which blocks the creation.

Example 338. Create a constraint identical to an existing constraint

Create a constraint requiring all SEQUEL_OF relationships to have unique order properties

CREATE CONSTRAINT sequels IF NOT EXISTS
FOR ()-[sequel:SEQUEL_OF]-() REQUIRE sequel.order IS UNIQUE

Because the same constraint already exists, nothing will happen:

Result

(no changes, no records)

Notification

`CREATE CONSTRAINT sequels IF NOT EXISTS FOR ()-[e:SEQUEL_OF]-() REQUIRE (e.order) IS UNIQUE` has no
effect.
`CONSTRAINT sequels FOR ()-[e:SEQUEL_OF]-() REQUIRE (e.order) IS UNIQUE` already exists.

Example 339. Create a relationship property uniqueness constraint when the same constraint with a
different name already exists

Create a constraint requiring all SEQUEL_OF relationships to have unique order properties

CREATE CONSTRAINT new_sequels IF NOT EXISTS
FOR ()-[sequel:SEQUEL_OF]-() REQUIRE sequel.order IS UNIQUE

Because a constraint with a different name (sequels) on the same schema exists, nothing will
happen:

Result

(no changes, no records)

Notification

`CREATE CONSTRAINT new_sequels IF NOT EXISTS FOR ()-[e:SEQUEL_OF]-() REQUIRE (e.order) IS UNIQUE` has
no effect.
`CONSTRAINT sequels FOR ()-[e:SEQUEL_OF]-() REQUIRE (e.order) IS UNIQUE` already exists.

Example 340. Create a relationship property uniqueness constraint with the same name as an existing
constraint of a different type

Create a constraint requiring all AUTHORED relationships to have unique name properties

CREATE CONSTRAINT author_name IF NOT EXISTS
FOR ()-[a:AUTHORED]-() REQUIRE a.name IS UNIQUE

Because a node property existence constraint named author_name already exists, nothing will
happen:

Result

(no changes, no records)

Notification

`CREATE CONSTRAINT author_name IF NOT EXISTS FOR ()-[e:AUTHORED]-() REQUIRE (e.name) IS UNIQUE` has
no effect.
`CONSTRAINT author_name FOR (e:Author) REQUIRE (e.name) IS NOT NULL` already exists.

Creating an already existing constraint will fail

Creating a constraint with the same name or on the same node label or relationship type and properties
that are already constrained by a constraint of the same type will fail. Property uniqueness and key
constraints are also not allowed on the same schema.

Example 341. Create a constraint identical to an existing constraint

Create a constraint requiring all SEQUEL_OF relationships to have unique order properties, given an
identical constraint already exists

CREATE CONSTRAINT sequels
FOR ()-[sequel:SEQUEL_OF]-() REQUIRE sequel.order IS UNIQUE

Error message

An equivalent constraint already exists, 'Constraint( id=5, name='sequels', type='RELATIONSHIP UNIQUENESS', schema=()-[:SEQUEL_OF {order}]-(), ownedIndex=4 )'.

 The constraint type will be updated to say RELATIONSHIP PROPERTY UNIQUENESS in a future version of Neo4j.

Example 342. Create a constraint with a different name but on the same schema as an existing constraint

Create a constraint requiring all Book nodes to have unique isbn properties, given that a constraint on
that schema already exists

CREATE CONSTRAINT new_book_isbn
FOR (book:Book) REQUIRE book.isbn IS UNIQUE

Error message

Constraint already exists: Constraint( id=3, name='book_isbn', type='UNIQUENESS', schema=(:Book {isbn}), ownedIndex=2 )

 The constraint type will be updated to say NODE PROPERTY UNIQUENESS in a future version of Neo4j.

Example 343. Creating a constraint with the same name but on a different schema as an existing
constraint

Create a constraint requiring all AUTHORED relationships to have unique name properties, given that a
constraint on a different schema with the same name already exists

CREATE CONSTRAINT author_name
FOR ()-[a:AUTHORED]-() REQUIRE a.name IS UNIQUE

Error message

There already exists a constraint called 'author_name'.

Example 344. Creating a property type constraint on a property when a property type constraint
constraining the property to a different type already exist

Create a constraint requiring order properties on PART_OF relationships to be of type FLOAT, given a
constraint requiring the same properties to be of type INTEGER already exists

CREATE CONSTRAINT new_part_of
FOR ()-[part:PART_OF]-() REQUIRE part.order IS :: FLOAT

Error message

Conflicting constraint already exists: Constraint( id=21, name='part_of', type='RELATIONSHIP PROPERTY TYPE', schema=()-[:PART_OF {order}]-(), propertyType=INTEGER )

Example 345. Creating a node key constraint on the same schema as an existing property uniqueness
constraint

Create a node key constraint on the properties title and publicationYear on nodes with the Book
label, when a property uniqueness constraint already exists on the same label and property
combination

CREATE CONSTRAINT book_titles FOR (book:Book) REQUIRE (book.title, book.publicationYear) IS NODE KEY

Error message

Constraint already exists: Constraint( id=7, name='book_title_year', type='UNIQUENESS', schema=(:Book {title, publicationYear}), ownedIndex=6 )

Constraints and indexes


• Constraints and backing indexes

• Creating constraints with an index provider

• Constraint failures and indexes

Constraints and backing indexes

Property uniqueness constraints and key constraints are backed by range indexes. This means that
creating a property uniqueness or key constraint will create a range index with the same name, node
label/relationship type and property combination as its owning constraint. Single property constraints will
create single property indexes and multiple property composite constraints will create composite indexes.

 Indexes of the same index type, label/relationship type, and property combination cannot be added separately. However, dropping a property uniqueness or key constraint will also drop its backing index. If the backing index is still required, the index needs to be explicitly re-created.
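
A sketch of that workflow, using a hypothetical uniqueness constraint named person_id on :Person(id) (both names are illustrative):

// Dropping the constraint also removes its backing range index.
DROP CONSTRAINT person_id

// Re-create a plain range index explicitly if the lookup performance is still needed.
CREATE INDEX person_id_index FOR (person:Person) ON (person.id)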

Property uniqueness and key constraints require an index because it allows the system to quickly check if
a node with the same label and property value or a relationship with the same type and property value
already exists. Without an index, the system would need to scan all nodes with the same label, which
would be slow and inefficient, especially as the graph grows. The index makes these checks much faster
by enabling direct lookups instead of scanning the entire graph. Cypher will use the indexes with an
owning constraint in the same way that it utilizes other search-performance indexes. For more information
about how indexes impact query performance, see The impact of indexes on query performance.
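
For example, an equality lookup on a constrained property can be served by the constraint-backed index just like any other range index. A sketch (the exact plan depends on the data and the planner):

// The planner can resolve this lookup via the index backing the book_isbn constraint.
PROFILE
MATCH (book:Book {isbn: '1449356265'})
RETURN book.title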

These indexes are listed in the owningConstraint column returned by the SHOW INDEX command, and the
ownedIndex column returned by the SHOW CONSTRAINT command.

Example 346. List constraints with backing indexes

Query

SHOW CONSTRAINTS WHERE ownedIndex IS NOT NULL

Result

+----------------------------------------------------------------------------------------------------
----------------------------------------------------------------+
| id | name | type | entityType | labelsOrTypes |
properties | ownedIndex | propertyType |
+----------------------------------------------------------------------------------------------------
----------------------------------------------------------------+
| 21 | "actor_fullname" | "NODE_KEY" | "NODE" | ["Actor"] |
["firstname", "surname"] | "actor_fullname" | NULL |
| 3 | "book_isbn" | "UNIQUENESS" | "NODE" | ["Book"] |
["isbn"] | "book_isbn" | NULL |
| 7 | "book_title_year" | "UNIQUENESS" | "NODE" | ["Book"] |
["title", "publicationYear"] | "book_title_year" | NULL |
| 17 | "director_imdbId" | "NODE_KEY" | "NODE" | ["Director"] |
["imdbId"] | "director_imdbId" | NULL |
| 23 | "knows_since_how" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["KNOWS"] |
["since", "how"] | "knows_since_how" | NULL |
| 25 | "node_uniqueness_param" | "UNIQUENESS" | "NODE" | ["Book"] |
["prop1"] | "node_uniqueness_param" | NULL |
| 19 | "ownershipId" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["OWNS"] |
["ownershipId"] | "ownershipId" | NULL |
| 9 | "prequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" | ["PREQUEL_OF"] |
["order", "author"] | "prequels" | NULL |
| 5 | "sequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" | ["SEQUEL_OF"] |
["order"] | "sequels" | NULL |
+----------------------------------------------------------------------------------------------------
----------------------------------------------------------------+

Example 347. List indexes with owning constraints

Query

SHOW INDEXES WHERE owningConstraint IS NOT NULL

Result

+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
----------------+
| id | name | state | populationPercent | type | entityType |
labelsOrTypes | properties | indexProvider | owningConstraint | lastRead
| readCount |
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
----------------+
| 20 | "actor_fullname" | "ONLINE" | 100.0 | "RANGE" | "NODE" | ["Actor"]
| ["firstname", "surname"] | "range-1.0" | "actor_fullname" | 2024-10-07T12:12:51.893Z |
3 |
| 2 | "book_isbn" | "ONLINE" | 100.0 | "RANGE" | "NODE" | ["Book"]
| ["isbn"] | "range-1.0" | "book_isbn" | 2024-10-07T11:58:09.252Z |
2 |
| 6 | "book_title_year" | "ONLINE" | 100.0 | "RANGE" | "NODE" | ["Book"]
| ["title", "publicationYear"] | "range-1.0" | "book_title_year" | NULL |
0 |
| 16 | "director_imdbId" | "ONLINE" | 100.0 | "RANGE" | "NODE" |
["Director"] | ["imdbId"] | "range-1.0" | "director_imdbId" | NULL
| 0 |
| 22 | "knows_since_how" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" | ["KNOWS"]
| ["since", "how"] | "range-1.0" | "knows_since_how" | 2024-10-07T12:12:51.894Z |
1 |
| 24 | "node_uniqueness_param" | "ONLINE" | 100.0 | "RANGE" | "NODE" | ["Book"]
| ["prop1"] | "range-1.0" | "node_uniqueness_param" | NULL |
0 |
| 18 | "ownershipId" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" | ["OWNS"]
| ["ownershipId"] | "range-1.0" | "ownershipId" | NULL |
0 |
| 8 | "prequels" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" |
["PREQUEL_OF"] | ["order", "author"] | "range-1.0" | "prequels" | NULL
| 0 |
| 4 | "sequels" | "ONLINE" | 100.0 | "RANGE" | "RELATIONSHIP" |
["SEQUEL_OF"] | ["order"] | "range-1.0" | "sequels" | 2024-10-
07T11:57:12.999Z | 1 |
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
----------------+

 Property existence and property type constraints are not backed by indexes.

Creating constraints with an index provider

Because property uniqueness and key constraints have backing indexes, an index provider can be specified when creating these constraints, using the OPTIONS clause and the indexProvider option.

The only valid value for the index provider is:

• range-1.0 Default

Example 348. Create a node key constraint with a specified index provider

Create a constraint requiring Actor nodes to have a unique surname property as a node key,
specifying range-1.0 as index provider

CREATE CONSTRAINT constraint_with_provider
FOR (actor:Actor) REQUIRE actor.surname IS NODE KEY
OPTIONS {
  indexProvider: 'range-1.0'
}

Result

Added 1 constraint.

Example 349. Create a relationship property uniqueness constraint with a specified index provider

Create a constraint requiring SEQUEL_OF relationships to have a unique combination of order, seriesTitle, and number properties, specifying range-1.0 as index provider

CREATE CONSTRAINT rel_constraint_with_options
FOR ()-[sequel:SEQUEL_OF]-() REQUIRE (sequel.order, sequel.seriesTitle, sequel.number) IS UNIQUE
OPTIONS {
  indexProvider: 'range-1.0'
}

Result

Added 1 constraint.

There are no valid index configuration values for the constraint-backing range indexes.

Constraint failures and indexes

Attempting to create any type of constraint with the same name as an existing index will fail.

Example 350. Creating a node property type constraint with the same name as an existing index

Create an index with the name directors

CREATE INDEX directors FOR (director:Director) ON (director.name)

Create a node property type constraint with the name directors

CREATE CONSTRAINT directors FOR (movie:Movie) REQUIRE movie.director IS :: STRING

Error message

There already exists an index called 'directors'.

Creating key or property uniqueness constraints on the same schema as an existing index will fail.

Example 351. Creating a node property uniqueness constraint on the same schema as an existing index

Create an index for wordCount properties on Book nodes

CREATE INDEX book_word_count FOR (book:Book) ON (book.wordCount)

Create a constraint requiring all Book nodes to have unique wordCount properties

CREATE CONSTRAINT word_count FOR (book:Book) REQUIRE book.wordCount IS UNIQUE

Error message

There already exists an index (:Book {wordCount}).


A constraint cannot be created until the index has been dropped.
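
A sketch of the required order of operations, reusing the names from the example above:

// First drop the conflicting index ...
DROP INDEX book_word_count

// ... then the property uniqueness constraint on the same schema can be created.
CREATE CONSTRAINT word_count FOR (book:Book) REQUIRE book.wordCount IS UNIQUE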

Constraints and data violation scenarios


• Creating data that violates existing constraints will fail

• Removing existence and key constrained properties will fail

• Modifying type constrained properties will fail

• Creating constraints when there exists conflicting data will fail

Creating data that violates existing constraints will fail

Existing constraints preventing data creation:

• Property uniqueness constraint: fails when creating nodes or relationships with non-unique properties/property combinations.

• Property existence constraint: fails when creating nodes or relationships without the existence-constrained property.

• Property type constraint: fails when creating nodes or relationships with a property of the wrong type.

• Key constraint: fails when creating nodes or relationships without the constrained property, and when creating them with non-unique property combinations.

Example 352. Create a node that violates a node property uniqueness constraint

Create a Book node with an isbn property that already exists

CREATE (book:Book {isbn: '1449356265', title: 'Graph Databases'})

Error message

Node(0) already exists with label `Book` and property `isbn` = '1449356265'

Example 353. Create a node that violates an existing node property existence constraint

Create an Author node without a name property, given a property existence constraint on
:Author(name)

CREATE (author:Author {surname: 'Austen'})

Error message

Node(0) with label `Author` must have the property `name`

Example 354. Create a relationship that violates an existing relationship property type constraint

Create a PART_OF relationship with a STRING order property, given a property type constraint on the
relationship type PART_OF restricting the order property to INTEGER values

MATCH (movie:Movie {title:'Iron Man'}), (franchise:Franchise {name:'MCU'})
CREATE (movie)-[part:PART_OF {order: '1'}]->(franchise)

Error message

Relationship(0) with type `PART_OF` has property `order` of wrong type `String`. Allowed types:
INTEGER

Example 355. Create a node that violates an existing node key constraint

Create an Actor node without a firstname property, given a node key constraint on
:Actor(firstname, surname)

CREATE (actor:Actor {surname: 'Wood'})

Error message

Node(0) with label `Actor` must have the properties (`firstname`, `surname`)

Removing existence and key constrained properties will fail

Example 356. Remove a node property existence constrained property

Remove the name property from an existing Author node, given a property existence constraint on
:Author(name)

MATCH (author:Author {name: 'Virginia Woolf'})
REMOVE author.name

Error message

Node(0) with label `Author` must have the property `name`

Example 357. Remove a node key constrained property

Remove the firstname property from an existing node Actor, given a node key constraint on
:Actor(firstname, surname)

MATCH (actor:Actor {firstname: 'Keanu', surname: 'Reeves'})
REMOVE actor.firstname

Error message

Node(0) with label `Actor` must have the properties (`firstname`, `surname`)

Modifying type constrained properties will fail

Example 358. Modify a type constrained property

Modify the title for the Movie 'Iron Man' to an INTEGER value, given a constraint requiring title
properties to be of type STRING

MATCH (m:Movie {title: 'Iron Man'})
SET m.title = 13

Error message

Node(9) with label `Movie` required the property `title` to be of type `STRING`, but was of type
`INTEGER`.

Creating constraints when there exists conflicting data will fail

Existing data preventing constraint creation:

• Property uniqueness constraint: blocked by non-unique property values/property combinations.

• Property existence constraint: blocked by nodes or relationships missing the constrained property.

• Property type constraint: blocked by properties of the wrong type.

• Key constraint: blocked by nodes or relationships missing the constrained property, and by non-unique property combinations.

Example 359. Create a node property uniqueness constraint when conflicting nodes exist

Create two Book nodes with the same name property value

CREATE (:Book {isbn: '9780393972832', title: 'Moby Dick'}),
  (:Book {isbn: '9780763630188', title: 'Moby Dick'})

Create a constraint requiring Book nodes to have unique title properties, when there already exists
two Book nodes with the same title

CREATE CONSTRAINT book_title FOR (book:Book) REQUIRE book.title IS UNIQUE

In this case, the constraint cannot be created because it is in conflict with the existing graph. Either
use indexes instead, or remove/correct the offending nodes and then re-apply the constraint.

Error message

Unable to create Constraint( name='book_title', type='UNIQUENESS', schema=(:Book {title}) ):
Both Node(0) and Node(1) have the label `Book` and property `title` = 'Moby Dick'

The constraint creation fails on the first offending nodes that are found. This does not guarantee that
there are no other offending nodes in the graph. Therefore, all the data should be checked and
cleaned up before re-attempting the constraint creation.

Find all offending nodes with the non-unique property values for the constraint above

MATCH (book1:Book), (book2:Book)
WHERE book1.title = book2.title AND NOT book1 = book2
RETURN book1, book2

Example 360. Create a relationship property existence constraint when conflicting relationships exist

Create a constraint requiring all WROTE relationships to have a language property, when there already
exists a WROTE relationship without a language property

CREATE CONSTRAINT wrote_language FOR ()-[wrote:WROTE]-() REQUIRE wrote.language IS NOT NULL

In this case, the constraint cannot be created because it is in conflict with the existing graph. Remove
or correct the offending relationships and then re-apply the constraint.

Error message

Unable to create Constraint( type='RELATIONSHIP PROPERTY EXISTENCE', schema=()-[:WROTE {language}]-() ):
Relationship(0) with type `WROTE` must have the property `language`. Note that only the first found violation is shown.

The constraint creation fails on the first offending relationship that is found. This does not guarantee
that there are no other offending relationships in the graph. Therefore, all the data should be checked
and cleaned up before re-attempting the constraint creation.

Find all offending relationships missing the property for the constraint above

MATCH ()-[wrote:WROTE]-()
WHERE wrote.language IS NULL
RETURN wrote

Generic MATCH queries to find the properties preventing the creation of particular constraints:

Constraint Query

Node property uniqueness constraint


MATCH (n1:Label), (n2:Label)
WHERE n1.prop = n2.prop AND NOT n1 = n2
RETURN n1, n2

Relationship property uniqueness constraint


MATCH ()-[r1:REL_TYPE]->(), ()-[r2:REL_TYPE]->()
WHERE r1.prop = r2.prop AND NOT r1 = r2
RETURN r1, r2

Node property existence constraint


MATCH (n:Label)
WHERE n.prop IS NULL
RETURN n

Relationship property existence constraint


MATCH ()-[r:REL_TYPE]->()
WHERE r.prop IS NULL
RETURN r


Node property type constraint


MATCH (n:Label)
WHERE n.prop IS NOT :: <TYPE>
RETURN n

Relationship property type constraint


MATCH ()-[r:REL_TYPE]->()
WHERE r.prop IS NOT :: <TYPE>
RETURN r

Node key constraint


MATCH (n1:Label), (n2:Label)
WHERE n1.prop = n2.prop AND NOT n1 = n2
UNWIND [n1, n2] AS node
RETURN node, 'non-unique' AS reason
UNION
MATCH (n:Label)
WHERE n.prop IS NULL
RETURN n AS node, 'non-existing' AS reason

Relationship key constraint


MATCH ()-[r1:REL_TYPE]->(), ()-[r2:REL_TYPE]->()
WHERE r1.prop = r2.prop AND NOT r1 = r2
UNWIND [r1, r2] AS relationship
RETURN relationship, 'non-unique' AS reason
UNION
MATCH ()-[r:REL_TYPE]->()
WHERE r.prop IS NULL
RETURN r AS relationship, 'non-existing' AS reason

SHOW CONSTRAINTS
To list all constraints with the default output columns, use SHOW CONSTRAINTS. If all columns are required,
use SHOW CONSTRAINTS YIELD *. For the full command syntax to list constraints, see Syntax → SHOW
CONSTRAINTS.

One of the output columns from SHOW CONSTRAINTS is the name of the constraint. This can be used to drop
the constraint with the DROP CONSTRAINT command.

 Listing constraints requires the SHOW CONSTRAINTS privilege.

Example 361. List all constraints with default output columns

Query

SHOW CONSTRAINTS

Result

+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------+
| id | name | type | entityType |
labelsOrTypes | properties | ownedIndex | propertyType
|
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------+
| 21 | "actor_fullname" | "NODE_KEY" | "NODE" | ["Actor"]
| ["firstname", "surname"] | "actor_fullname" | NULL
|
| 10 | "author_name" | "NODE_PROPERTY_EXISTENCE" | "NODE" |
["Author"] | ["name"] | NULL | NULL
|
| 3 | "book_isbn" | "UNIQUENESS" | "NODE" | ["Book"]
| ["isbn"] | "book_isbn" | NULL
|
| 7 | "book_title_year" | "UNIQUENESS" | "NODE" | ["Book"]
| ["title", "publicationYear"] | "book_title_year" | NULL
|
| 28 | "constraint_with_provider" | "NODE_KEY" | "NODE" | ["Actor"]
| ["surname"] | "constraint_with_provider" | NULL
|
| 17 | "director_imdbId" | "NODE_KEY" | "NODE" |
["Director"] | ["imdbId"] | "director_imdbId" | NULL
|
| 23 | "knows_since_how" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["KNOWS"]
| ["since", "how"] | "knows_since_how" | NULL
|
| 14 | "movie_tagline" | "NODE_PROPERTY_TYPE" | "NODE" | ["Movie"]
| ["tagline"] | NULL | "STRING | LIST<STRING NOT
NULL>" |
| 12 | "movie_title" | "NODE_PROPERTY_TYPE" | "NODE" | ["Movie"]
| ["title"] | NULL | "STRING"
|
| 25 | "node_uniqueness_param" | "UNIQUENESS" | "NODE" | ["Book"]
| ["prop1"] | "node_uniqueness_param" | NULL
|
| 19 | "ownershipId" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["OWNS"]
| ["ownershipId"] | "ownershipId" | NULL
|
| 13 | "part_of" | "RELATIONSHIP_PROPERTY_TYPE" | "RELATIONSHIP" |
["PART_OF"] | ["order"] | NULL | "INTEGER"
|
| 15 | "part_of_tags" | "RELATIONSHIP_PROPERTY_TYPE" | "RELATIONSHIP" |
["PART_OF"] | ["tags"] | NULL | "STRING |
LIST<STRING NOT NULL>" |
| 9 | "prequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["PREQUEL_OF"] | ["order", "author"] | "prequels" | NULL
|
| 30 | "rel_constraint_with_options" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["SEQUEL_OF"] | ["order", "seriesTitle", "number"] | "rel_constraint_with_options" | NULL
|
| 26 | "rel_exist_param" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "RELATIONSHIP" | ["WROTE"]
| ["published"] | NULL | NULL
|
| 5 | "sequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["SEQUEL_OF"] | ["order"] | "sequels" | NULL
|
| 11 | "wrote_year" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "RELATIONSHIP" | ["WROTE"]
| ["year"] | NULL | NULL
|
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------+

Example 362. List all constraints with full details

To return the full details of the constraints on a database, use SHOW CONSTRAINTS YIELD *

List all constraints with YIELD *

SHOW CONSTRAINTS YIELD *

Result

+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------+
| id | name | type | entityType |
labelsOrTypes | properties | ownedIndex | propertyType
| options | createStatement
|
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------+
| 21 | "actor_fullname" | "NODE_KEY" | "NODE" | ["Actor"]
| ["firstname", "surname"] | "actor_fullname" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `actor_fullname` FOR (n:`Actor`)
REQUIRE (n.`firstname`, n.`surname`) IS NODE KEY" |
| 10 | "author_name" | "NODE_PROPERTY_EXISTENCE" | "NODE" |
["Author"] | ["name"] | NULL | NULL
| NULL | "CREATE CONSTRAINT `author_name` FOR (n:`Author`)
REQUIRE (n.`name`) IS NOT NULL" |
| 3 | "book_isbn" | "UNIQUENESS" | "NODE" | ["Book"]
| ["isbn"] | "book_isbn" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `book_isbn` FOR (n:`Book`)
REQUIRE (n.`isbn`) IS UNIQUE" |
| 7 | "book_title_year" | "UNIQUENESS" | "NODE" | ["Book"]
| ["title", "publicationYear"] | "book_title_year" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `book_title_year` FOR (n:`Book`)
REQUIRE (n.`title`, n.`publicationYear`) IS UNIQUE" |
| 28 | "constraint_with_provider" | "NODE_KEY" | "NODE" | ["Actor"]
| ["surname"] | "constraint_with_provider" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `constraint_with_provider` FOR
(n:`Actor`) REQUIRE (n.`surname`) IS NODE KEY" |
| 17 | "director_imdbId" | "NODE_KEY" | "NODE" |
["Director"] | ["imdbId"] | "director_imdbId" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `director_imdbId` FOR
(n:`Director`) REQUIRE (n.`imdbId`) IS NODE KEY" |
| 23 | "knows_since_how" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["KNOWS"]
| ["since", "how"] | "knows_since_how" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `knows_since_how` FOR ()-
[r:`KNOWS`]-() REQUIRE (r.`since`, r.`how`) IS RELATIONSHIP KEY" |
| 14 | "movie_tagline" | "NODE_PROPERTY_TYPE" | "NODE" | ["Movie"]
| ["tagline"] | NULL | "STRING | LIST<STRING NOT
NULL>" | NULL | "CREATE CONSTRAINT `movie_tagline` FOR
(n:`Movie`) REQUIRE (n.`tagline`) IS :: STRING | LIST<STRING NOT NULL>" |
| 12 | "movie_title" | "NODE_PROPERTY_TYPE" | "NODE" | ["Movie"]
| ["title"] | NULL | "STRING"
| NULL | "CREATE CONSTRAINT `movie_title` FOR (n:`Movie`)
REQUIRE (n.`title`) IS :: STRING" |
| 25 | "node_uniqueness_param" | "UNIQUENESS" | "NODE" | ["Book"]
| ["prop1"] | "node_uniqueness_param" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `node_uniqueness_param` FOR
(n:`Book`) REQUIRE (n.`prop1`) IS UNIQUE" |
| 19 | "ownershipId" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["OWNS"]
| ["ownershipId"] | "ownershipId" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `ownershipId` FOR ()-[r:`OWNS`]-
() REQUIRE (r.`ownershipId`) IS RELATIONSHIP KEY" |
| 13 | "part_of" | "RELATIONSHIP_PROPERTY_TYPE" | "RELATIONSHIP" |
["PART_OF"] | ["order"] | NULL | "INTEGER"
| NULL | "CREATE CONSTRAINT `part_of` FOR ()-[r:`PART_OF`]-
() REQUIRE (r.`order`) IS :: INTEGER" |
| 15 | "part_of_tags" | "RELATIONSHIP_PROPERTY_TYPE" | "RELATIONSHIP" |
["PART_OF"] | ["tags"] | NULL | "STRING |
LIST<STRING NOT NULL>" | NULL | "CREATE CONSTRAINT
`part_of_tags` FOR ()-[r:`PART_OF`]-() REQUIRE (r.`tags`) IS :: STRING | LIST<STRING NOT NULL>"

|
| 9 | "prequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["PREQUEL_OF"] | ["order", "author"] | "prequels" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `prequels` FOR ()-
[r:`PREQUEL_OF`]-() REQUIRE (r.`order`, r.`author`) IS UNIQUE" |
| 30 | "rel_constraint_with_options" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["SEQUEL_OF"] | ["order", "seriesTitle", "number"] | "rel_constraint_with_options" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `rel_constraint_with_options`
FOR ()-[r:`SEQUEL_OF`]-() REQUIRE (r.`order`, r.`seriesTitle`, r.`number`) IS UNIQUE" |
| 26 | "rel_exist_param" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "RELATIONSHIP" | ["WROTE"]
| ["published"] | NULL | NULL
| NULL | "CREATE CONSTRAINT `rel_exist_param` FOR ()-
[r:`WROTE`]-() REQUIRE (r.`published`) IS NOT NULL" |
| 5 | "sequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["SEQUEL_OF"] | ["order"] | "sequels" | NULL
| {indexConfig: {}, indexProvider: "range-1.0"} | "CREATE CONSTRAINT `sequels` FOR ()-
[r:`SEQUEL_OF`]-() REQUIRE (r.`order`) IS UNIQUE" |
| 11 | "wrote_year" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "RELATIONSHIP" | ["WROTE"]
| ["year"] | NULL | NULL
| NULL | "CREATE CONSTRAINT `wrote_year` FOR ()-[r:`WROTE`]-
() REQUIRE (r.`year`) IS NOT NULL" |
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------+

 The type column returns UNIQUENESS for the node property uniqueness constraint and RELATIONSHIP_UNIQUENESS for the relationship property uniqueness constraint. This will be updated in a future version of Neo4j. Node property uniqueness constraints will be updated to NODE_PROPERTY_UNIQUENESS and relationship property uniqueness constraints to RELATIONSHIP_PROPERTY_UNIQUENESS.

Listing constraints with filtering


The SHOW CONSTRAINTS command can be filtered in various ways. The filtering of rows can be done using
constraint type keywords or a WHERE clause, while filtering of columns is achieved by specifying the desired
columns in a YIELD clause.

Example 363. List only specific constraint types

List only key constraints

SHOW KEY CONSTRAINTS

Result

+----------------------------------------------------------------------------------------------------
----------------------------------------------------------+
| id | name | type | entityType | labelsOrTypes | properties
| ownedIndex | propertyType |
+----------------------------------------------------------------------------------------------------
----------------------------------------------------------+
| 21 | "actor_fullname" | "NODE_KEY" | "NODE" | ["Actor"] |
["firstname", "surname"] | "actor_fullname" | NULL |
| 28 | "constraint_with_provider" | "NODE_KEY" | "NODE" | ["Actor"] | ["surname"]
| "constraint_with_provider" | NULL |
| 17 | "director_imdbId" | "NODE_KEY" | "NODE" | ["Director"] | ["imdbId"]
| "director_imdbId" | NULL |
| 23 | "knows_since_how" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["KNOWS"] | ["since",
"how"] | "knows_since_how" | NULL |
| 19 | "ownershipId" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["OWNS"] |
["ownershipId"] | "ownershipId" | NULL |
+----------------------------------------------------------------------------------------------------
----------------------------------------------------------+

For a full list of all the constraint types (and synonyms) available in this command see Syntax →
SHOW CONSTRAINTS.

Example 364. Filtering constraints using the WHERE clause

List only constraints with a RELATIONSHIP entityType

SHOW CONSTRAINTS
WHERE entityType = 'RELATIONSHIP'

Result

+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------+
| id | name | type | entityType |
labelsOrTypes | properties | ownedIndex | propertyType
|
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------+
| 23 | "knows_since_how" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["KNOWS"]
| ["since", "how"] | "knows_since_how" | NULL
|
| 19 | "ownershipId" | "RELATIONSHIP_KEY" | "RELATIONSHIP" | ["OWNS"]
| ["ownershipId"] | "ownershipId" | NULL
|
| 13 | "part_of" | "RELATIONSHIP_PROPERTY_TYPE" | "RELATIONSHIP" |
["PART_OF"] | ["order"] | NULL | "INTEGER"
|
| 15 | "part_of_tags" | "RELATIONSHIP_PROPERTY_TYPE" | "RELATIONSHIP" |
["PART_OF"] | ["tags"] | NULL | "STRING |
LIST<STRING NOT NULL>" |
| 9 | "prequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["PREQUEL_OF"] | ["order", "author"] | "prequels" | NULL
|
| 30 | "rel_constraint_with_options" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["SEQUEL_OF"] | ["order", "seriesTitle", "number"] | "rel_constraint_with_options" | NULL
|
| 26 | "rel_exist_param" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "RELATIONSHIP" | ["WROTE"]
| ["published"] | NULL | NULL
|
| 5 | "sequels" | "RELATIONSHIP_UNIQUENESS" | "RELATIONSHIP" |
["SEQUEL_OF"] | ["order"] | "sequels" | NULL
|
| 11 | "wrote_year" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "RELATIONSHIP" | ["WROTE"]
| ["year"] | NULL | NULL
|
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---------+

Example 365. Returning specific columns for all constraints

It is possible to return only specific columns of the available constraints using the YIELD clause:

List only the name, type, and createStatement columns

SHOW CONSTRAINTS
YIELD name, type, createStatement

Result

+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---+
| name | type | createStatement
|
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---+
| "actor_fullname" | "NODE_KEY" | "CREATE CONSTRAINT
`actor_fullname` FOR (n:`Actor`) REQUIRE (n.`firstname`, n.`surname`) IS NODE KEY"
|
| "author_name" | "NODE_PROPERTY_EXISTENCE" | "CREATE CONSTRAINT
`author_name` FOR (n:`Author`) REQUIRE (n.`name`) IS NOT NULL"
|
| "book_isbn" | "UNIQUENESS" | "CREATE CONSTRAINT `book_isbn`
FOR (n:`Book`) REQUIRE (n.`isbn`) IS UNIQUE"
|
| "book_title_year" | "UNIQUENESS" | "CREATE CONSTRAINT
`book_title_year` FOR (n:`Book`) REQUIRE (n.`title`, n.`publicationYear`) IS UNIQUE"
|
| "constraint_with_provider" | "NODE_KEY" | "CREATE CONSTRAINT
`constraint_with_provider` FOR (n:`Actor`) REQUIRE (n.`surname`) IS NODE KEY"
|
| "director_imdbId" | "NODE_KEY" | "CREATE CONSTRAINT
`director_imdbId` FOR (n:`Director`) REQUIRE (n.`imdbId`) IS NODE KEY"
|
| "knows_since_how" | "RELATIONSHIP_KEY" | "CREATE CONSTRAINT
`knows_since_how` FOR ()-[r:`KNOWS`]-() REQUIRE (r.`since`, r.`how`) IS RELATIONSHIP KEY"
|
| "movie_tagline" | "NODE_PROPERTY_TYPE" | "CREATE CONSTRAINT
`movie_tagline` FOR (n:`Movie`) REQUIRE (n.`tagline`) IS :: STRING | LIST<STRING NOT NULL>"
|
| "movie_title" | "NODE_PROPERTY_TYPE" | "CREATE CONSTRAINT
`movie_title` FOR (n:`Movie`) REQUIRE (n.`title`) IS :: STRING"
|
| "node_uniqueness_param" | "UNIQUENESS" | "CREATE CONSTRAINT
`node_uniqueness_param` FOR (n:`Book`) REQUIRE (n.`prop1`) IS UNIQUE"
|
| "ownershipId" | "RELATIONSHIP_KEY" | "CREATE CONSTRAINT
`ownershipId` FOR ()-[r:`OWNS`]-() REQUIRE (r.`ownershipId`) IS RELATIONSHIP KEY"
|
| "part_of" | "RELATIONSHIP_PROPERTY_TYPE" | "CREATE CONSTRAINT `part_of`
FOR ()-[r:`PART_OF`]-() REQUIRE (r.`order`) IS :: INTEGER"
|
| "part_of_tags" | "RELATIONSHIP_PROPERTY_TYPE" | "CREATE CONSTRAINT
`part_of_tags` FOR ()-[r:`PART_OF`]-() REQUIRE (r.`tags`) IS :: STRING | LIST<STRING NOT NULL>"
|
| "prequels" | "RELATIONSHIP_UNIQUENESS" | "CREATE CONSTRAINT `prequels`
FOR ()-[r:`PREQUEL_OF`]-() REQUIRE (r.`order`, r.`author`) IS UNIQUE"
|
| "rel_constraint_with_options" | "RELATIONSHIP_UNIQUENESS" | "CREATE CONSTRAINT
`rel_constraint_with_options` FOR ()-[r:`SEQUEL_OF`]-() REQUIRE (r.`order`, r.`seriesTitle`,
r.`number`) IS UNIQUE" |
| "rel_exist_param" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "CREATE CONSTRAINT
`rel_exist_param` FOR ()-[r:`WROTE`]-() REQUIRE (r.`published`) IS NOT NULL"
|
| "sequels" | "RELATIONSHIP_UNIQUENESS" | "CREATE CONSTRAINT `sequels`
FOR ()-[r:`SEQUEL_OF`]-() REQUIRE (r.`order`) IS UNIQUE"
|
| "wrote_year" | "RELATIONSHIP_PROPERTY_EXISTENCE" | "CREATE CONSTRAINT `wrote_year`
FOR ()-[r:`WROTE`]-() REQUIRE (r.`year`) IS NOT NULL"
|
+----------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------
---+

Result columns for listing constraints


Listing constraints output

• id (INTEGER): The id of the constraint. Default output.

• name (STRING): Name of the constraint (explicitly set by the user or automatically assigned). Default output.

• type (STRING): The ConstraintType of this constraint (UNIQUENESS (node uniqueness), RELATIONSHIP_UNIQUENESS, NODE_PROPERTY_EXISTENCE, RELATIONSHIP_PROPERTY_EXISTENCE, NODE_PROPERTY_TYPE, RELATIONSHIP_PROPERTY_TYPE, NODE_KEY, or RELATIONSHIP_KEY). Default output.
   UNIQUENESS and RELATIONSHIP_UNIQUENESS will be updated to say NODE_PROPERTY_UNIQUENESS and RELATIONSHIP_PROPERTY_UNIQUENESS respectively in a future version of Neo4j.

• entityType (STRING): Type of entities this constraint represents (NODE or RELATIONSHIP). Default output.

• labelsOrTypes (LIST<STRING>): The labels or relationship types of this constraint. The list returned will only include a single value (the name of the constrained node label or relationship type). Default output.

• properties (LIST<STRING>): The properties of this constraint. Default output.

• ownedIndex (STRING): The name of the index associated with the constraint, or null in case no index is associated with it. Default output.

• propertyType (STRING): The property type the property is restricted to for property type constraints, or null for the other constraints. Default output. New

• options (MAP): The options passed to the CREATE command for the index associated with the constraint, or null if no index is associated with the constraint.

• createStatement (STRING): Statement used to create the constraint.

DROP CONSTRAINT
Constraints are dropped using the DROP CONSTRAINT command. For the full command syntax to drop
constraints, see Syntax → DROP CONSTRAINT.

 Dropping a constraint requires the DROP CONSTRAINT privilege.

Drop a constraint by name


A constraint can be dropped using the name with the DROP CONSTRAINT constraint_name command. It is
the same command for all constraint types. The name of the constraint can be found using the SHOW
CONSTRAINTS command, given in the output column name.

Example 366. Drop a constraint by name

Drop the constraint book_isbn

DROP CONSTRAINT book_isbn

Result

Removed 1 constraint.

Drop a constraint with a parameter New in 5.16


Constraints can be dropped with a parameterized name.

Example 367. Drop a constraint using a parameter

Parameters

{
"name": "actor_fullname"
}

Drop a constraint with a parameterized name

DROP CONSTRAINT $name

Result

Removed 1 constraint.

Drop a non-existing constraint
If it is uncertain whether a constraint with a given name exists, and you want to drop it if it does without getting an error if it does not, use IF EXISTS. This ensures that no error is thrown. As of Neo4j 5.17, an informational notification is returned stating that the constraint does not exist.

Example 368. Drop a non-existing constraint

Drop the non-existing constraint missing_constraint_name

DROP CONSTRAINT missing_constraint_name IF EXISTS

Result

(no changes, no records)

Notification

`DROP CONSTRAINT missing_constraint_name IF EXISTS` has no effect. `missing_constraint_name` does not exist.

Syntax
This page contains the syntax for creating, listing, and dropping the constraints available in Neo4j.

More details about the syntax can be found in the Operations Manual → Cypher syntax for administration
commands.

CREATE CONSTRAINT
Constraints are created with the CREATE CONSTRAINT command. When creating a constraint, it is
recommended to provide a constraint name. This name must be unique among both indexes and
constraints. If a name is not explicitly given, a unique name will be auto-generated.

Creating a constraint requires the CREATE CONSTRAINT privilege.

The CREATE CONSTRAINT command is optionally idempotent. This means its default behavior is to throw an
error if an attempt is made to create the same constraint twice. With the IF NOT EXISTS flag, no error is
thrown and nothing happens should a constraint with the same name or same schema and constraint type
already exist. It may still throw an error if conflicting data, indexes, or constraints exist. Examples of this
are nodes with missing properties, indexes with the same name, or constraints with the same schema but a
different, conflicting constraint type. As of Neo4j 5.17, an informational notification is returned when
nothing happens, showing the existing constraint that blocks the creation.

For constraints that are backed by an index, the index provider for the backing index can be specified using
the OPTIONS clause. Only one valid value exists for the index provider, range-1.0, which is the default value.
There is no supported index configuration for range indexes.
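As an illustration (the constraint name and label below are hypothetical, not taken from the manual's example set), the index provider can be supplied like this:

CREATE CONSTRAINT person_id IF NOT EXISTS
FOR (p:Person) REQUIRE p.id IS UNIQUE
OPTIONS {indexProvider: 'range-1.0'}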

Create property uniqueness constraints
Syntax for creating a node property uniqueness constraint on a single property

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR (n:LabelName)
REQUIRE n.propertyName IS [NODE] UNIQUE
[OPTIONS "{" option: value[, ...] "}"]

Syntax for creating a composite node property uniqueness constraint on multiple properties

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR (n:LabelName)
REQUIRE (n.propertyName_1, ..., n.propertyName_n) IS [NODE] UNIQUE
[OPTIONS "{" option: value[, ...] "}"]

Syntax for creating a relationship property uniqueness constraint on a single property New

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR ()-"["r:RELATIONSHIP_TYPE"]"-()
REQUIRE r.propertyName IS [REL[ATIONSHIP]] UNIQUE
[OPTIONS "{" option: value[, ...] "}"]

Syntax for creating a composite relationship property uniqueness constraint on multiple properties New

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR ()-"["r:RELATIONSHIP_TYPE"]"-()
REQUIRE (r.propertyName_1, ..., r.propertyName_n) IS [REL[ATIONSHIP]] UNIQUE
[OPTIONS "{" option: value[, ...] "}"]

An index provider can be specified using the OPTIONS clause.

For examples on how to create property uniqueness constraints, see Create, show, and drop constraints →
Create property uniqueness constraint. Property uniqueness constraints are index-backed.
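For instance, a single-property and a composite node property uniqueness constraint could look like the following sketches (the names and labels are hypothetical):

CREATE CONSTRAINT book_isbn IF NOT EXISTS
FOR (book:Book) REQUIRE book.isbn IS UNIQUE

CREATE CONSTRAINT film_title_year IF NOT EXISTS
FOR (film:Film) REQUIRE (film.title, film.year) IS UNIQUE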

Create property existence constraints (Enterprise Edition)


Syntax for creating a node property existence constraint

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR (n:LabelName)
REQUIRE n.propertyName IS NOT NULL
[OPTIONS "{" "}"]

Syntax for creating a relationship property existence constraint

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR ()-"["r:RELATIONSHIP_TYPE"]"-()
REQUIRE r.propertyName IS NOT NULL
[OPTIONS "{" "}"]

There are no supported OPTIONS values for property existence constraints, but an empty options map is
allowed for consistency.

For examples on how to create property existence constraints, see Create, show, and drop constraints →
Create property existence constraints.
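For instance, a node property existence constraint could look like this sketch (the name and label are hypothetical):

CREATE CONSTRAINT author_name IF NOT EXISTS
FOR (author:Author) REQUIRE author.name IS NOT NULL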

Create property type constraints (Enterprise Edition, introduced in Neo4j 5.9)
Syntax for creating a node property type constraint

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR (n:LabelName)
REQUIRE n.propertyName {[IS] :: | IS TYPED} <TYPE>
[OPTIONS "{" "}"]

Syntax for creating a relationship property type constraint

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR ()-"["r:RELATIONSHIP_TYPE"]"-()
REQUIRE r.propertyName {[IS] :: | IS TYPED} <TYPE>
[OPTIONS "{" "}"]

The three variations of the expression, IS ::, ::, and IS TYPED are syntactic synonyms for the same
expression. The preferred syntax is the IS :: variant.

Where <TYPE> is one of the following property types:

• BOOLEAN

• STRING

• INTEGER

• FLOAT

• DATE

• LOCAL TIME

• ZONED TIME

• LOCAL DATETIME

• ZONED DATETIME

• DURATION

• POINT

• LIST<BOOLEAN NOT NULL> New

• LIST<STRING NOT NULL> New

• LIST<INTEGER NOT NULL> New

• LIST<FLOAT NOT NULL> New

• LIST<DATE NOT NULL> New

• LIST<LOCAL TIME NOT NULL> New

• LIST<ZONED TIME NOT NULL> New

• LIST<LOCAL DATETIME NOT NULL> New

• LIST<ZONED DATETIME NOT NULL> New

• LIST<DURATION NOT NULL> New

• LIST<POINT NOT NULL> New

• Any closed dynamic union of the above types, e.g. INTEGER | FLOAT | STRING. New

Allowed syntax variations of these types are listed in Types and their synonyms.

There are no supported OPTIONS values for property type constraints, but an empty options map is allowed
for consistency.

For examples on how to create property type constraints, see Create, show, and drop constraints → Create
property type constraints.
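For instance, a node property type constraint using the preferred IS :: variant could look like this sketch (the name and label are hypothetical):

CREATE CONSTRAINT movie_title_string IF NOT EXISTS
FOR (m:Movie) REQUIRE m.title IS :: STRING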

Create key constraints (Enterprise Edition)


Syntax for creating a node key constraint on a single property

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR (n:LabelName)
REQUIRE n.propertyName IS [NODE] KEY
[OPTIONS "{" option: value[, ...] "}"]

Syntax for creating a composite node key constraint on multiple properties

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR (n:LabelName)
REQUIRE (n.propertyName_1, ..., n.propertyName_n) IS [NODE] KEY
[OPTIONS "{" option: value[, ...] "}"]

Syntax for creating a relationship key constraint on a single property New

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR ()-"["r:RELATIONSHIP_TYPE"]"-()
REQUIRE r.propertyName IS [REL[ATIONSHIP]] KEY
[OPTIONS "{" option: value[, ...] "}"]

Syntax for creating a composite relationship key constraint on multiple properties New

CREATE CONSTRAINT [constraint_name] [IF NOT EXISTS]


FOR ()-"["r:RELATIONSHIP_TYPE"]"-()
REQUIRE (r.propertyName_1, ..., r.propertyName_n) IS [REL[ATIONSHIP]] KEY
[OPTIONS "{" option: value[, ...] "}"]

An index provider can be specified using the OPTIONS clause.

For examples on how to create key constraints, see Create, show, and drop constraints → Create key
constraints. Key constraints are index-backed.
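For instance, a composite node key constraint could look like this sketch (the name, label, and properties are hypothetical):

CREATE CONSTRAINT actor_fullname IF NOT EXISTS
FOR (a:Actor) REQUIRE (a.firstname, a.surname) IS NODE KEY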

SHOW CONSTRAINTS
To list all constraints with the default output columns, use SHOW CONSTRAINTS. If all columns are required,
use SHOW CONSTRAINTS YIELD *. If only specific columns are required, use SHOW CONSTRAINTS YIELD
field[, …]. The SHOW CONSTRAINTS clause can also be filtered using the WHERE clause.

Listing constraints requires the SHOW CONSTRAINTS privilege.

Syntax to list constraints with default return columns

SHOW [
ALL
|NODE UNIQUE[NESS]
|REL[ATIONSHIP] UNIQUE[NESS]
|UNIQUE[NESS]
|NODE [PROPERTY] EXIST[ENCE]
|REL[ATIONSHIP] [PROPERTY] EXIST[ENCE]
|[PROPERTY] EXIST[ENCE]
|NODE PROPERTY TYPE
|REL[ATIONSHIP] PROPERTY TYPE
|PROPERTY TYPE
|NODE KEY
|REL[ATIONSHIP] KEY
|KEY
] CONSTRAINT[S]
[WHERE expression]

Syntax for listing constraints with full return columns

SHOW [
ALL
|NODE UNIQUE[NESS]
|REL[ATIONSHIP] UNIQUE[NESS]
|UNIQUE[NESS]
|NODE [PROPERTY] EXIST[ENCE]
|REL[ATIONSHIP] [PROPERTY] EXIST[ENCE]
|[PROPERTY] EXIST[ENCE]
|NODE PROPERTY TYPE
|REL[ATIONSHIP] PROPERTY TYPE
|PROPERTY TYPE
|NODE KEY
|REL[ATIONSHIP] KEY
|KEY
] CONSTRAINT[S]
YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

The type filtering keywords filter the returned constraints on constraint type:

Type filters

ALL: Returns all constraints, no filtering on constraint type. This is the default if none is given.

NODE UNIQUE[NESS]: Returns the node property uniqueness constraints. New

REL[ATIONSHIP] UNIQUE[NESS]: Returns the relationship property uniqueness constraints. New

UNIQUE[NESS]: Returns all property uniqueness constraints, for both nodes and relationships. New

NODE [PROPERTY] EXIST[ENCE]: Returns the node property existence constraints.

REL[ATIONSHIP] [PROPERTY] EXIST[ENCE]: Returns the relationship property existence constraints.

[PROPERTY] EXIST[ENCE]: Returns all property existence constraints, for both nodes and relationships.

NODE PROPERTY TYPE: Returns the node property type constraints. New

REL[ATIONSHIP] PROPERTY TYPE: Returns the relationship property type constraints. New

PROPERTY TYPE: Returns all property type constraints, for both nodes and relationships. New

NODE KEY: Returns the node key constraints.

REL[ATIONSHIP] KEY: Returns the relationship key constraints. New

KEY: Returns all node and relationship key constraints. New
For examples on how to list constraints, see Create, show, and drop constraints → SHOW CONSTRAINTS.
For full details of the result columns for the SHOW CONSTRAINTS command, see Create, show, and drop
constraints → Result columns for listing constraints.
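For instance, the type filters can be combined with YIELD and WHERE (an illustrative query; the Book label is hypothetical):

SHOW UNIQUENESS CONSTRAINTS YIELD name, labelsOrTypes, properties
WHERE 'Book' IN labelsOrTypes
RETURN name, properties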

DROP CONSTRAINT
Constraints are dropped using the DROP CONSTRAINT command. Dropping a constraint is done by
specifying the name of the constraint.

Dropping a constraint requires the DROP CONSTRAINT privilege.

Syntax for dropping a constraint by name

DROP CONSTRAINT constraint_name [IF EXISTS]

This command is optionally idempotent. This means its default behavior is to throw an error if an attempt
is made to drop the same constraint twice. With the IF EXISTS flag, no error is thrown and nothing
happens should the constraint not exist. As of Neo4j 5.17, an informational notification is instead returned
detailing that the constraint does not exist.

For examples on how to drop constraints, see Create, show, and drop constraints → DROP CONSTRAINT.

Execution plans and query tuning
Cypher queries are executed according to a particular execution plan. The execution plan consists of a
binary tree of operators, with information about the step-by-step execution of a query, and it may differ
depending on which runtime the query uses. Apart from selecting a different runtime, there are numerous
other ways in which a query can be tuned.

More information about each of these topics can be found in the following sections:

• Understanding execution plans

• Operators

• Cypher runtimes

• Query tuning

Note on PROFILE and EXPLAIN


The queries in this section are often prepended with either PROFILE or EXPLAIN. Both produce an execution
plan, but there are important differences:

EXPLAIN
If you want to see the execution plan but not run the query, prepend your Cypher statement with
EXPLAIN. The statement will always return an empty result and make no changes to the database.

PROFILE
If you want to run the query and see which operators are doing most of the work, use PROFILE. This will
run your query and keep track of how many rows pass through each operator, and how much each
operator needs to interact with the storage layer to retrieve the necessary data. Note that profiling your
query uses more resources, so you should not profile unless you are actively working on a query.
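For instance, the same statement can be prepended with either keyword (an illustrative query that works on any database):

EXPLAIN MATCH (n) RETURN count(n)

PROFILE MATCH (n) RETURN count(n)

The first only shows the plan; the second also executes the query and annotates each operator with the rows and database hits it produced.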

Understanding execution plans


This page describes how to understand the execution plans produced by the Cypher planner. It begins by
explaining the lifecycle of a Cypher query, before giving a step-by-step breakdown of a particular query
and the execution plan it uses. It then explains the difference between lazy and eager query evaluation.

The lifecycle of a Cypher query


A Cypher query begins as a declarative query represented as a string, describing the graph pattern to
match in a database. After parsing, the query string goes through the query optimizer (also known as the
planner), which produces an imperative plan, known as the logical plan, to determine the most efficient
way of executing the query given the current state of the database.[13] In the final phase, this logical plan is
turned into an executable physical plan, which actually runs the query against the database. Executing this
physical plan is the task of the Cypher runtime.

Example graph
To explain how to understand a Cypher execution plan, a graph based on the UK national rail network is
used. The data in the graph is taken from publicly available datasets.

[Figure: example graph of Station nodes (Peckham Rye, Denmark Hill, Clapham High Street, Wandsworth Road, Clapham Junction) and Stop nodes with arrives/departs time properties, connected by CALLS_AT and NEXT relationships.]

The graph contains two types of nodes: Stop and Station. Each Stop on a train service CALLS_AT one
Station, and has the properties arrives and departs that give the times the train is at the Station.
Following the NEXT relationship of a Stop will give the next Stop of a service.

To recreate the graph, run the following query against an empty Neo4j database:

Query

CREATE (pmr:Station {name: 'Peckham Rye'}),


(dmk:Station {name: 'Denmark Hill'}),
(clp:Station {name: 'Clapham High Street'}),
(wwr:Station {name: 'Wandsworth Road'}),
(clj:Station {name: 'Clapham Junction'}),
(s1:Stop {arrives: time('17:19'), departs: time('17:20')}),
(s2:Stop {arrives: time('17:12'), departs: time('17:13')}),
(s3:Stop {arrives: time('17:10'), departs: time('17:11')}),
(s4:Stop {arrives: time('17:06'), departs: time('17:07')}),
(s5:Stop {arrives: time('16:58'), departs: time('17:01')}),
(s6:Stop {arrives: time('17:17'), departs: time('17:20')}),
(s7:Stop {arrives: time('17:08'), departs: time('17:10')}),
(clj)<-[:CALLS_AT]-(s1), (wwr)<-[:CALLS_AT]-(s2),
(clp)<-[:CALLS_AT]-(s3), (dmk)<-[:CALLS_AT]-(s4),
(pmr)<-[:CALLS_AT]-(s5), (clj)<-[:CALLS_AT]-(s6),
(dmk)<-[:CALLS_AT]-(s7),
(s5)-[:NEXT {distance: 1.2}]->(s4),(s4)-[:NEXT {distance: 0.34}]->(s3),
(s3)-[:NEXT {distance: 0.76}]->(s2), (s2)-[:NEXT {distance: 0.3}]->(s1),
(s7)-[:NEXT {distance: 1.4}]->(s6)

The example query uses a quantified path pattern to count the number of possible path patterns between
the start Station, Denmark Hill, and the end Station, Clapham Junction:

Query

MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)


((:Stop)-[:NEXT]->(:Stop))+
(a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
RETURN count(*)

As can be seen from the graph, two such patterns exist (one with a service departing Denmark Hill at
17:07 which stops at the Stations Clapham High Street and Wandsworth Road, and one direct service
departing Denmark Hill at 17:10).

For the purposes of understanding Cypher execution plans, however, the query result is less interesting
than the planning that produces it.

Reading execution plans


The Cypher planner produces logical plans which describe how a particular query is going to be executed.
This execution plan is essentially a binary tree of operators. An operator is, in turn, a specialized execution
module that is responsible for some type of transformation to the data before passing it on to the next
operator, until the desired graph pattern has been matched. The execution plans produced by the planner
thus decide which operators will be used and in what order they will be applied to achieve the aim
declared in the original query.

In order to view the plan of a query, prepend the query with EXPLAIN - this will not run the query, but only
show the tree of operators used to find the desired result.

Query

EXPLAIN
MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
((:Stop)-[:NEXT]->(:Stop))+
(a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
RETURN count(*)

This is the resulting execution plan:[14]

+-------------------+----+------------------------------------------------------------------------
+----------------+---------------------+
| Operator | Id | Details |
Estimated Rows | Pipeline |
+-------------------+----+------------------------------------------------------------------------
+----------------+---------------------+
| +ProduceResults | 0 | `count(*)` |
1 | In Pipeline 3 |
| | +----+------------------------------------------------------------------------
+----------------+---------------------+
| +EagerAggregation | 1 | count(*) AS `count(*)` |
1 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Filter | 2 | NOT anon_1 = anon_5 AND anon_0.name = $autostring_0 AND anon_0:Station |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Expand(All) | 3 | (d)-[anon_1:CALLS_AT]->(anon_0) |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Filter | 4 | d:Stop |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +NullifyMetadata | 14 | |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Repeat(Trail) | 5 | (a) (...){1, *} (d) |
0 | Fused in Pipeline 2 |
| |\ +----+------------------------------------------------------------------------
+----------------+---------------------+
| | +Filter | 6 | isRepeatTrailUnique(anon_8) AND anon_7:Stop |
6 | |
| | | +----+------------------------------------------------------------------------
+----------------+ |
| | +Expand(All) | 7 | (anon_9)<-[anon_8:NEXT]-(anon_7) |
6 | |
| | | +----+------------------------------------------------------------------------
+----------------+ |
| | +Filter | 8 | anon_9:Stop |
11 | |
| | | +----+------------------------------------------------------------------------
+----------------+ |
| | +Argument | 9 | anon_9 |
13 | Fused in Pipeline 1 |
| | +----+------------------------------------------------------------------------
+----------------+---------------------+
| +Filter | 10 | a:Stop |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Expand(All) | 11 | (anon_6)<-[anon_5:CALLS_AT]-(a) |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Filter | 12 | anon_6.name = $autostring_1 |
1 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +NodeByLabelScan | 13 | anon_6:Station |
10 | Fused in Pipeline 0 |
+-------------------+----+------------------------------------------------------------------------
+----------------+---------------------+

The operators can be seen in the leftmost column of the results table. The most important thing to
remember when reading execution plans is that they are read from the bottom up. To follow the execution
of this query it is, therefore, necessary to start from the bottom or leaf operator, NodeByLabelScan (which
fetches all nodes with a specific label from the node label index) and move step-by-step up the operator
tree to see how the data in the graph is gradually refined until the final, root operator, ProduceResults,
generates readable results for the user.

To read more about the specific role played by operators used in this example, and many others, see the
section on Operators.

The id column specifies a unique ID assigned to each operator. There are no guarantees about the order of
the ids, although they will usually start with 0 at the root operator, and will increase until the leaf operator
is reached at the beginning of the operator tree.

The Details column in the middle of the execution plan describes what task is performed by each
operator. For example, the details column of the Repeat(Trail) operator in the middle of the execution plan
(id 5), specifies that the operator traverses a quantified path pattern without an upward limit.

Finally, the Estimated Rows column details the number of rows that are expected to be produced by each
operator. This estimate is an approximate number based on the available statistical information, and the
planner uses it to choose a suitable execution plan.[15]

For details about how the different Cypher runtimes change a particular execution plan, see Runtime
concepts.

Lazy and eager query evaluation


In general, query evaluation is lazy. This means that most operators pipe their output rows to their parent
operators as soon as they are produced. In other words, a child operator may not be fully exhausted before
the parent operator starts consuming the input rows produced by the child.

However, some operators, such as those used for aggregation and sorting, need to aggregate all their
rows before they can produce output. These operators are called eager operators (see the
EagerAggregation operator in the above table (id 1) for an example). Such operators need to complete
execution in their entirety before any rows are sent to their parents as input, and are sometimes required to
enforce correct Cypher semantics. For more information about the row-by-row processing of Cypher
queries, see the section on Clause composition.
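For instance, adding an ORDER BY against the example graph above typically introduces the eager Sort operator into the plan (an illustrative query; the exact plan may vary between versions and runtimes):

EXPLAIN
MATCH (s:Stop)
RETURN s.departs
ORDER BY s.departs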

Operators
This page provides an overview of the available operators in Cypher. An operator is a specialized
execution module that is responsible for some type of transformation to the data in a query execution plan
before passing it on to the next operator, until the desired graph pattern has been matched.

For information about how to understand execution plans (and the role operators play in them), see
Understanding execution plans.

For more information about each operator, see Operators in detail.

Summary of operators
This table comprises all the execution plan operators ordered lexicographically.

• Leaf operators, in most cases, locate the starting nodes and relationships required in order to execute the query.

• Updating operators are used in queries that update the graph.

• Eager operators accumulate all their rows before piping them to the next operator.

Each operator is listed by name, with its leaf, updating, eager, or new markers in parentheses, followed by a description.

• AllNodesScan (Leaf): Reads all nodes from the node store.

• Anti: Tests for the absence of a pattern.

• AntiSemiApply: Performs a nested loop. Tests for the absence of a pattern predicate.

• Apply: Performs a nested loop. Yields rows from both the left-hand and right-hand side operators.

• Argument (Leaf): Indicates the variable to be used as an argument to the right-hand side of an Apply operator.

• ArgumentTracker (Leaf): Used to ensure row-by-row semantics. Restricts the Cypher runtime to not batch operations in larger chunks.

• AssertSameNode: Ensures that no node property uniqueness constraints are violated.

• AssertSameRelationship (New): Ensures that no relationship property uniqueness constraints are violated.

• AssertingMultiNodeIndexSeek (Leaf): Used to ensure that no property uniqueness constraints are violated.

• CacheProperties: Reads node or relationship properties and caches them.

• CartesianProduct: Produces a cartesian product of the inputs from the left-hand and right-hand operators.

• Create (Updating): Creates nodes and relationships.

• CreateIndex (Updating): Creates an index for either nodes or relationships.

• CreateConstraint (Updating): Creates a constraint for either nodes or relationships.

• Delete (Updating): Deletes a node or relationship.

• DetachDelete (Updating): Deletes a node and its relationships.

• DirectedAllRelationshipsScan (Leaf): Fetches all relationships and their start and end nodes in the database.

• DirectedRelationshipByElementIdSeek (Leaf): Reads one or more relationships by element id (specified via the function elementId()) from the relationship store and produces the relationship as well as the source and target node of the relationship.

• DirectedRelationshipByIdSeek (Leaf): Reads one or more relationships by id (specified via the function id()) from the relationship store, and produces the relationship as well as the source and target node of the relationship.

• DirectedRelationshipIndexContainsScan (Leaf): Examines all values stored in an index, searching for entries containing a specific STRING; for example, in queries including CONTAINS.

• DirectedRelationshipIndexEndsWithScan (Leaf): Examines all values stored in an index, searching for entries ending in a specific STRING; for example, in queries containing ENDS WITH.

• DirectedRelationshipIndexScan (Leaf): Examines all values stored in an index, returning all relationships and their start and end nodes with a particular relationship type and a specified property.

• DirectedRelationshipIndexSeek (Leaf): Finds relationships and their start and end nodes using an index seek.

• DirectedRelationshipIndexSeekByRange (Leaf): Finds relationships and their start and end nodes using an index seek where the value of the property matches a given prefix STRING.

• DirectedRelationshipTypeScan (Leaf): Fetches all relationships and their start and end nodes with a specific type from the relationship type index.

• DirectedUnionRelationshipTypesScan: Fetches all relationships and their start and end nodes with at least one of the provided types from the relationship type index.

• Distinct (Eager): Drops duplicate rows from the incoming stream of rows.

• DoNothingIfExists(CONSTRAINT) (Leaf): Checks if a constraint already exists; if it does, it stops the execution, if not, it continues.

• DoNothingIfExists(INDEX) (Leaf): Checks if an index already exists; if it does, it stops the execution, if not, it continues.

• DropConstraint (Leaf, Updating): Drops a constraint using its name.

• DropIndex (Leaf, Updating): Drops an index using its name.

• Eager (Eager): For isolation purposes, Eager ensures that operations affecting subsequent operations are executed fully for the whole dataset before continuing execution.

• EagerAggregation (Eager): Evaluates a grouping expression.

• EmptyResult (Eager): Eagerly loads all incoming data and discards it.

• EmptyRow (Leaf): Returns a single row with no columns.

• ExhaustiveLimit: Similar to the Limit operator, but always exhausts the input. Used when combining LIMIT and updates.

• Expand(All): Traverses incoming or outgoing relationships from a given node.

• Expand(Into): Finds all relationships between two nodes.

• Filter: Filters each row coming from the child operator, only passing through rows that evaluate the predicates to true.

• Foreach: Performs a nested loop. Yields rows from the left-hand operator and discards rows from the right-hand operator.

• IntersectionNodeByLabelsScan (Leaf, New): Fetches all nodes that have all of the provided labels from the node label index.

• LetAntiSemiApply: Performs a nested loop. Tests for the absence of a pattern predicate in queries containing multiple pattern predicates.

• LetSelectOrAntiSemiApply: Performs a nested loop. Tests for the absence of a pattern predicate that is combined with other predicates.

• LetSelectOrSemiApply: Performs a nested loop. Tests for the presence of a pattern predicate that is combined with other predicates.

• LetSemiApply: Performs a nested loop. Tests for the presence of a pattern predicate in queries containing multiple pattern predicates.

• Limit: Returns the first n rows from the incoming input.

• LoadCSV (Leaf): Loads data from a CSV source into the query.

• LockingMerge: Similar to the Merge operator but will lock the start and end node when creating a relationship if necessary.

• Merge: The Merge operator will either read or create nodes and/or relationships.

• MultiNodeIndexSeek (Leaf): Finds nodes using multiple index seeks.

• NodeByElementIdSeek (Leaf, New): Reads one or more nodes by id from the node store, specified via the function elementId().

• NodeByIdSeek (Leaf): Reads one or more nodes by id from the node store, specified via the function id().

• NodeByLabelScan (Leaf): Fetches all nodes with a specific label from the node label index.

• NodeCountFromCountStore (Leaf): Uses the count store to answer questions about node counts.

• NodeHashJoin (Eager): Executes a hash join on node ID.

• NodeIndexContainsScan (Leaf): Examines all values stored in an index, searching for entries containing a specific STRING.

• NodeIndexEndsWithScan (Leaf): Examines all values stored in an index, searching for entries ending in a specific STRING.

• NodeIndexScan (Leaf): Examines all values stored in an index, returning all nodes with a particular label with a specified property.

• NodeIndexSeek (Leaf): Finds nodes using an index seek.

• NodeIndexSeekByRange (Leaf): Finds nodes using an index seek where the value of the property matches the given prefix STRING.

• NodeLeftOuterHashJoin (Eager): Executes a left outer hash join.

• NodeRightOuterHashJoin (Eager): Executes a right outer hash join.

• NodeUniqueIndexSeek (Leaf): Finds nodes using an index seek within a unique index.

• NodeUniqueIndexSeekByRange (Leaf): Finds nodes using an index seek within a unique index where the value of the property matches the given prefix STRING.

• NullifyMetadata (New): Responsible for cleaning up the state produced by Repeat(Trail). It is only planned directly after Repeat(Trail).

• Optional: Yields a single row with all columns set to null if no data is returned by its source.

• OptionalExpand(All): Traverses relationships from a given node, producing a single row with the relationship and end node set to null if the predicates are not fulfilled.

• OptionalExpand(Into): Traverses all relationships between two nodes, producing a single row with the relationship and end node set to null if no matching relationships are found (the start node is the node with the smallest degree).

• OrderedAggregation: Similar to the EagerAggregation operator but relies on the ordering of incoming rows. It is not eager.

• OrderedDistinct: Similar to the Distinct operator but relies on the ordering of incoming rows.

• PartialSort: Sorts a row by multiple columns if there is already an ordering.

• PartialTop: Returns the first n rows sorted by multiple columns if there is already an ordering.

• PartitionedAllNodesScan (Leaf, New): Used by the parallel runtime to read all nodes from the node store.

• PartitionedDirectedAllRelationshipsScan (Leaf, New): Used by the parallel runtime to fetch all relationships and their start and end nodes from the database.

• PartitionedDirectedRelationshipIndexScan (Leaf, New): Used by the parallel runtime to examine all values stored in an index. It returns all relationships with a particular type and a specified property, along with their start and end nodes.

• PartitionedDirectedRelationshipIndexSeek (Leaf, New): Finds relationships and their start and end nodes using a parallel index seek.

• PartitionedDirectedRelationshipIndexSeekByRange (Leaf, New): Finds relationships using a parallel index seek where the value of the specified relationship type property is within a given range. It also finds the start and end nodes of those relationships.

• PartitionedDirectedRelationshipTypesScan (Leaf, New): Fetches all relationships with a specific type from the relationship type index using a parallel scan. It also fetches the start and end nodes of those relationships.

• PartitionedDirectedUnionRelationshipTypesScan (New): Fetches all relationships with at least one of the provided types from the relationship type index using a parallel scan. It also fetches the start and end nodes of those relationships.

• PartitionedNodeByLabelScan (Leaf, New): Used by the parallel runtime to fetch all nodes with a specific label from the node label index.

• PartitionedNodeIndexScan (Leaf, New): Used by the parallel runtime to examine all values stored in an index, returning all nodes with a particular label and a specified property.

• PartitionedNodeIndexSeek (Leaf, New): Used by the parallel runtime to find nodes using an index seek.

• PartitionedNodeIndexSeekByRange (Leaf, New): Finds nodes using a parallel index seek where the value of the specified property is within a given range.

• PartitionedSubtractionNodeByLabelsScan (Leaf, New): Used by the parallel runtime to fetch all nodes that have all of the first set of provided labels and none of the second provided set of labels from the node label index.

• PartitionedUndirectedAllRelationshipsScan (Leaf, New): Used by the parallel runtime to fetch all relationships and their start and end nodes from the database.

• PartitionedUndirectedRelationshipIndexScan (Leaf, New): Used by the parallel runtime to examine all values stored in an index, returning all relationships with a particular relationship type and a specified property. It also returns the start and end nodes of those relationships.

• PartitionedUndirectedRelationshipIndexSeek (Leaf, New): Finds relationships and their start and end nodes using a parallel index seek.

• PartitionedUndirectedRelationshipIndexSeekByRange (Leaf, New): Finds relationships using a parallel index seek where the value of the specified relationship property type is within a given range. It also finds the start and end nodes of those relationships.

• PartitionedUndirectedRelationshipTypeScan (Leaf, New): Used by the parallel runtime to fetch all relationships with a specific type from the relationship type index. It also fetches the start and end nodes of those relationships.

• PartitionedUndirectedUnionRelationshipTypesScan (Leaf, New): Used by the parallel runtime to fetch all relationships with at least one of the provided types from the relationship type index. It also fetches the start and end nodes of those relationships.

• PartitionedUnionNodeByLabelsScan (Leaf, New): Used by the parallel runtime to fetch all nodes that have at least one of the provided labels from the node label index.

• PartitionedUnwind (New): Used by the parallel runtime to return one row per item in a list.

• ProcedureCall: Calls a procedure.

• ProduceResults: Prepares the result so that it is consumable by the user.

• ProjectEndpoints: Projects the start and end node of a relationship.

• Projection: Evaluates a set of expressions, producing a row with the results thereof.

• RelationshipCountFromCountStore (Leaf): Uses the count store to answer questions about relationship counts.

• Repeat(Trail) (New): Solves quantified path patterns.

• RemoveLabels (Updating): Deletes labels from a node.

• RollUpApply: Performs a nested loop. Executes a pattern expression or pattern comprehension.

• SelectOrAntiSemiApply: Performs a nested loop. Tests for the absence of a pattern predicate if an expression predicate evaluates to false.

• SelectOrSemiApply: Performs a nested loop. Tests for the presence of a pattern predicate if an expression predicate evaluates to false.

• SemiApply: Performs a nested loop. Tests for the presence of a pattern predicate.

• SetLabels (Updating): Sets labels on a node.

• SetNodePropertiesFromMap (Updating): Sets properties from a map on a node.

• SetProperty (Updating): Sets a property on a node or relationship.

• SetProperties (Updating): Used when setting multiple properties on a node or relationship.

• SetRelationshipPropertiesFromMap (Updating): Sets properties from a map on a relationship.

• ShortestPath: Finds one or all shortest paths between two previously matched node variables.

• ShowConstraints (Leaf): Lists the available constraints.

• ShowFunctions (Leaf): Lists the available functions.

• ShowIndexes (Leaf): Lists the available indexes.

• ShowProcedures (Leaf): Lists the available procedures.

• ShowSettings (Leaf): Lists the available configuration settings.

• ShowTransactions (Leaf): Lists the available transactions on the current server.

• Skip: Skips n rows from the incoming rows.

• Sort (Eager): Sorts rows by a provided key.

• StatefulShortestPath(All) (New): Finds shortest paths from a previously matched node variable to an endpoint that was not previously matched.

• StatefulShortestPath(Into) (New): Finds shortest paths between two previously matched node variables. It uses a bidirectional breadth-first search (BFS) algorithm, which performs two BFS invocations at the same time, one from the left boundary node and one from the right boundary node.

• SubqueryForeach: Works like the Foreach operator but it is only used for executing subqueries.

• SubtractionNodeByLabelsScan (Leaf, New): Fetches all nodes that have all of the first set of provided labels and none of the second provided set of labels from the node label index.

• TerminateTransactions (Leaf): Terminates transactions with the given IDs.

• Top (Eager): Returns the first n rows sorted by a provided key.

• TransactionApply: Works like the Apply operator but will commit the current transaction after a specified number of rows.

• TransactionForeach: Works like the Foreach operator but will commit the current transaction after a specified number of rows.

• TriadicBuild: Used in conjunction with TriadicFilter to solve triangular queries.

• TriadicFilter: Used in conjunction with TriadicBuild to solve triangular queries.

• TriadicSelection: Solves triangular queries, such as the very common 'find my friends-of-friends that are not already my friends'.

• UndirectedAllRelationshipsScan (Leaf): Fetches all relationships and their start and end nodes in the database.

• UndirectedRelationshipByElementIdSeek (Leaf): Reads one or more relationships by element id (specified via the function elementId()) from the relationship store. As the direction is unspecified, two rows are produced for each relationship as a result of alternating the combination of the start and end node.

• UndirectedRelationshipByIdSeek (Leaf): Reads one or more relationships by id (specified via the function id()) from the relationship store. As the direction is unspecified, two rows are produced for each relationship as a result of alternating the combination of the start and end node.

• UndirectedRelationshipIndexContainsScan (Leaf): Examines all values stored in an index, searching for entries containing a specific STRING; for example, in queries including CONTAINS.

• UndirectedRelationshipIndexEndsWithScan (Leaf): Examines all values stored in an index, searching for entries ending in a specific STRING; for example, in queries containing ENDS WITH.

• UndirectedRelationshipIndexScan (Leaf): Examines all values stored in an index, returning all relationships and their start and end nodes with a particular relationship type and a specified property.

• UndirectedRelationshipIndexSeek (Leaf): Finds relationships and their start and end nodes using an index seek.

• UndirectedRelationshipIndexSeekByRange (Leaf): Finds relationships and their start and end nodes using an index seek where the value of the property matches a given prefix STRING.

• UndirectedRelationshipTypeScan (Leaf): Fetches all relationships and their start and end nodes with a specific type from the relationship type index.

• UndirectedUnionRelationshipTypesScan (Leaf): Fetches all relationships and their start and end nodes with at least one of the provided types from the relationship type index.

• Union: Concatenates the results from the right-hand operator with the results from the left-hand operator.

• UnionNodeByLabelsScan (Leaf): Fetches all nodes that have at least one of the provided labels from the node label index.

• Unwind: Returns one row per item in a list.

• ValueHashJoin (Eager): Executes a hash join on arbitrary values.

• VarLengthExpand(All): Traverses variable-length relationships from a given node.

• VarLengthExpand(Into): Finds all variable-length relationships between two nodes.

• VarLengthExpand(Pruning): Traverses variable-length relationships from a given node and only returns unique end nodes.

• VarLengthExpand(Pruning,BFS): Traverses variable-length relationships from a given node and only returns unique end nodes.
Database hits
Each operator will send a request to the storage engine to do work such as retrieving or updating data. A
database hit (DBHits) is an abstract unit of this storage engine work.

These are all the actions that trigger one or more database hits:

• Create actions
◦ Create a node.

◦ Create a relationship.

◦ Create a new node label.

◦ Create a new relationship type.

◦ Create a new ID for property keys with the same name.

• Delete actions
◦ Delete a node.

◦ Delete a relationship.

• Update actions
◦ Set one or more labels on a node.

◦ Remove one or more labels from a node.

• Node-specific actions
◦ Get a node by its ID.

◦ Get the degree of a node.

◦ Determine whether a node is dense.

◦ Determine whether a label is set on a node.

◦ Get the labels of a node.

◦ Get a property of a node.

◦ Get an existing node label.

◦ Get the name of a label by its ID, or its ID by its name.

• Relationship-specific actions
◦ Get a relationship by its ID.

◦ Get a property of a relationship.

◦ Get an existing relationship type.

◦ Get a relationship type name by its ID, or its ID by its name.

• General actions
◦ Get the name of a property key by its ID, or its ID by the key name.

◦ Find a node or relationship through an index seek or index scan.

◦ Find a path in a variable-length expand.

◦ Find a shortest path.

◦ Ask the count store for a value.

• Schema actions
◦ Add an index.

◦ Drop an index.

◦ Get the reference of an index.

◦ Create a constraint.

◦ Drop a constraint.

• Call a procedure.

• Call a user-defined function.
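For instance, profiling a query against the example rail graph from the previous section reports the database hits incurred by each operator in the DB Hits column (an illustrative query; the exact counts depend on the store contents and version):

PROFILE
MATCH (s:Stop)-[:CALLS_AT]->(:Station {name: 'Denmark Hill'})
RETURN s.departs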

Operators in detail
This page contains details and an example query for all operators available in Cypher. The operators are
grouped by the similarity of their characteristics.

Certain operators are only used by a subset of the runtimes that Cypher can choose from. If that is the
case, the example queries will be prefixed with an option to choose one of these runtimes.

All Nodes Scan


The AllNodesScan operator reads all nodes from the node store. The variable that will contain the nodes is
seen in the arguments. Any query using this operator is likely to encounter performance problems on a
non-trivial database.

Example 369. AllNodesScan

Query

PROFILE
MATCH (n)
RETURN n

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | n | 35 | 35 | 0 | |
| | |
| | +---------+----------------+------+---------+----------------+
| | |
| +AllNodesScan | n | 35 | 35 | 36 | 120 |
3/0 | 0.354 | Fused in Pipeline 0 |
+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 36, total allocated memory: 184

Partitioned All Nodes Scan (introduced in Neo4j 5.17)


The PartitionedAllNodesScan is a variant of the AllNodesScan operator used by the parallel runtime. It
allows the store to be partitioned into different segments where each segment can be scanned
independently in parallel.

Example 370. PartitionedAllNodesScan

Query

CYPHER runtime=parallel
PROFILE
MATCH (n)
RETURN n

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+--------------------------+----+---------+----------------+------+---------+------------------------
+-----------+---------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses
| Time (ms) | Pipeline |
+--------------------------+----+---------+----------------+------+---------+------------------------
+-----------+---------------+
| +ProduceResults | 0 | n | 35 | 35 | 71 | 2/0
| 6.470 | In Pipeline 1 |
| | +----+---------+----------------+------+---------+------------------------
+-----------+---------------+
| +PartitionedAllNodesScan | 1 | n | 35 | 35 | 35 | 3/0
| 1.491 | In Pipeline 0 |
+--------------------------+----+---------+----------------+------+---------+------------------------
+-----------+---------------+

Total database accesses: 106

Directed Relationship Index Scan


The DirectedRelationshipIndexScan operator examines all values stored in an index, returning all
relationships and their start and end nodes with a particular relationship type and a specified property.

Example 371. DirectedRelationshipIndexScan

Query

PROFILE
MATCH ()-[r: WORKS_IN]->()
WHERE r.title IS NOT NULL
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------------------
+----------------------------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+--------------------------------
+----------------------------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | r
| 15 | 15 | 0 | | | |
|
| |
+----------------------------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +DirectedRelationshipIndexScan | RANGE INDEX (anon_0)-[r:WORKS_IN(title)]->(anon_1) WHERE title IS
NOT NULL | 15 | 15 | 16 | 120 | 3/1 | 2.464 |
Fused in Pipeline 0 |
+--------------------------------
+----------------------------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 16, total allocated memory: 184

Partitioned Directed Relationship Index Scan (introduced in Neo4j 5.17)


The PartitionedDirectedRelationshipIndexScan is a variant of the DirectedRelationshipIndexScan
operator used by the parallel runtime. It allows the index to be partitioned into different segments where
each segment can be scanned independently in parallel.

Example 372. PartitionedDirectedRelationshipIndexScan

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[r: WORKS_IN]->()
WHERE r.title IS NOT NULL
RETURN r

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+-------------------------------------------+----
+----------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------------------------------+----
+----------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | r
| 15 | 15 | 70 | 1/0 | 2.865 | In Pipeline 1 |
| | +----
+----------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +PartitionedDirectedRelationshipIndexScan | 1 | RANGE INDEX (anon_0)-[r:WORKS_IN(title)]->(anon_1)
WHERE title IS NOT NULL | 15 | 15 | 16 | 2/0 | 0.527 | In
Pipeline 0 |
+-------------------------------------------+----
+----------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+

Total database accesses: 86

Undirected Relationship Index Scan


The UndirectedRelationshipIndexScan operator examines all values stored in an index, returning all
relationships and their start and end nodes with a particular relationship type and a specified property.

Example 373. UndirectedRelationshipIndexScan

Query

PROFILE
MATCH ()-[r: WORKS_IN]-()
WHERE r.title IS NOT NULL
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------------------
+---------------------------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+----------------------------------
+---------------------------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | r
| 30 | 30 | 0 | | | |
|
| |
+---------------------------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +UndirectedRelationshipIndexScan | RANGE INDEX (anon_0)-[r:WORKS_IN(title)]-(anon_1) WHERE title IS
NOT NULL | 30 | 30 | 16 | 120 | 3/1 | 1.266 |
Fused in Pipeline 0 |
+----------------------------------
+---------------------------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 16, total allocated memory: 184

Partitioned Undirected Relationship Index Scan (introduced in Neo4j 5.17)


The PartitionedUndirectedRelationshipIndexScan is a variant of the UndirectedRelationshipIndexScan
operator used by the parallel runtime. It allows the index to be partitioned into different segments where
each segment can be scanned independently in parallel.

Example 374. PartitionedUndirectedRelationshipIndexScan

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[r: WORKS_IN]-()
WHERE r.title IS NOT NULL
RETURN r

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+---------------------------------------------+----
+---------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------------------------+----
+---------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | r
| 30 | 30 | 140 | 1/0 | 3.088 | In Pipeline 1 |
| | +----
+---------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +PartitionedUndirectedRelationshipIndexScan | 1 | RANGE INDEX (anon_0)-[r:WORKS_IN(title)]-
(anon_1) WHERE title IS NOT NULL | 30 | 30 | 16 | 2/0 |
0.572 | In Pipeline 0 |
+---------------------------------------------+----
+---------------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+

Total database accesses: 156

Directed Relationship Index Seek


The DirectedRelationshipIndexSeek operator finds relationships and their start and end nodes using an
index seek. The relationship variable and the index used are shown in the arguments of the operator.

Example 375. DirectedRelationshipIndexSeek

Query

PROFILE
MATCH (candidate)-[r:WORKS_IN]->()
WHERE r.title = 'chief architect'
RETURN candidate

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------------------
+-----------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+--------------------------------
+-----------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | candidate
| 2 | 1 | 0 | | | |
|
| |
+-----------------------------------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +DirectedRelationshipIndexSeek | RANGE INDEX (candidate)-[r:WORKS_IN(title)]->(anon_0) WHERE title
= $autostring_0 | 2 | 1 | 2 | 120 | 3/1 |
0.591 | Fused in Pipeline 0 |
+--------------------------------
+-----------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 2, total allocated memory: 184

Partitioned Directed Relationship Index Seek (introduced in Neo4j 5.17)


The PartitionedDirectedRelationshipIndexSeek is a variant of the DirectedRelationshipIndexSeek
operator used by the parallel runtime. It allows the index to be partitioned into different segments where
each segment can be scanned independently in parallel.

Example 376. PartitionedDirectedRelationshipIndexSeek

Query

CYPHER runtime=parallel
PROFILE
MATCH (candidate)-[r:WORKS_IN]->()
WHERE r.title = 'chief architect'
RETURN candidate

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+-------------------------------------------+----
+-----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------------------------------+----
+-----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | candidate
| 2 | 1 | 2 | 2/0 | 0.284 | In Pipeline 1 |
| | +----
+-----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| +PartitionedDirectedRelationshipIndexSeek | 1 | RANGE INDEX (candidate)-[r:WORKS_IN(title)]-
>(anon_0) WHERE title = $autostring_0 | 2 | 1 | 2 | 2/0 |
0.148 | In Pipeline 0 |
+-------------------------------------------+----
+-----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+

Total database accesses: 4

Undirected Relationship Index Seek


The UndirectedRelationshipIndexSeek operator finds relationships and their start and end nodes using an
index seek. The relationship variable and the index used are shown in the arguments of the operator.

Example 377. UndirectedRelationshipIndexSeek

Query

PROFILE
MATCH (candidate)-[r:WORKS_IN]-()
WHERE r.title = 'chief architect'
RETURN candidate

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------------------
+----------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+----------------------------------
+----------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | candidate
| 4 | 2 | 0 | | | |
|
| |
+----------------------------------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +UndirectedRelationshipIndexSeek | RANGE INDEX (candidate)-[r:WORKS_IN(title)]-(anon_0) WHERE title
= $autostring_0 | 4 | 2 | 2 | 120 | 3/1 |
0.791 | Fused in Pipeline 0 |
+----------------------------------
+----------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 2, total allocated memory: 184

Partitioned Undirected Relationship Index Seek


The PartitionedUndirectedRelationshipIndexSeek operator finds relationships and their start and end
nodes using an index seek. The relationship variable and the index used are shown in the arguments of the
operator.

Example 378. PartitionedUndirectedRelationshipIndexSeek

Query

CYPHER runtime=parallel
PROFILE
MATCH (candidate)-[r:WORKS_IN]-()
WHERE r.title = 'chief architect'
RETURN candidate

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+---------------------------------------------+----
+----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------------------------+----
+----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | candidate
| 4 | 2 | 4 | 2/0 | 0.333 | In Pipeline 1 |
| | +----
+----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| +PartitionedUndirectedRelationshipIndexSeek | 1 | RANGE INDEX (candidate)-[r:WORKS_IN(title)]-
(anon_0) WHERE title = $autostring_0 | 4 | 2 | 2 | 2/0 |
0.151 | In Pipeline 0 |
+---------------------------------------------+----
+----------------------------------------------------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+

Total database accesses: 6

Directed Relationship By Element Id Seek


The DirectedRelationshipByElementIdSeek operator reads one or more relationships by element id from
the relationship store (specified via the function elementId()) and produces the relationship as well as the
source and target node of the relationship.
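
Note that elementId() returns a STRING value, so in practice the element id is usually captured from an
earlier query and passed in as a parameter rather than written as a literal. A minimal sketch, with an
illustrative parameter name:

MATCH (n1)-[r]->()
WHERE elementId(r) = $relElementId
RETURN r, n1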

Example 379. DirectedRelationshipByElementIdSeek

Query

PROFILE
MATCH (n1)-[r]->()
WHERE elementId(r) = 0
RETURN r, n1

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------------------------+----+----------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| Operator | Id | Details |
Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+--------------------------------------+----+----------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| +ProduceResults | 0 | r, n1 |
1 | 0 | 0 | 0 | 0/0 | 0.314 | |
| | +----+----------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
|
| +DirectedRelationshipByElementIdSeek | 1 | (n1)-[r]->(anon_0) WHERE elementId(r) = $autoint_0 |
1 | 0 | 0 | 248 | 0/0 | 2.337 | In Pipeline 0 |
+--------------------------------------+----+----------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+

Total database accesses: 0, total allocated memory: 312

Directed Relationship By Id Seek


The DirectedRelationshipByIdSeek operator reads one or more relationships by id from the relationship
store, and produces the relationship as well as the source and target node of the relationship.
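
Because the operator reads one or more relationships, a list of ids can also be supplied. A sketch of the
multi-id form, which would typically still be planned with DirectedRelationshipByIdSeek:

MATCH (n1)-[r]->()
WHERE id(r) IN [0, 1, 2]
RETURN r, n1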

Example 380. DirectedRelationshipByIdSeek

Query

PROFILE
MATCH (n1)-[r]->()
WHERE id(r) = 0
RETURN r, n1

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------------------+----+---------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------------------+----+---------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | r, n1 | 1 |
1 | 7 | 0 | | | |
| | +----+---------------------------------------------+----------------
+------+---------+----------------+ | | |
| +DirectedRelationshipByIdSeek | 1 | (n1)-[r]->(anon_0) WHERE id(r) = $autoint_0 | 1 |
1 | 1 | 248 | 3/0 | 0.483 | Fused in Pipeline 0 |
+-------------------------------+----+---------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 8, total allocated memory: 312

Undirected Relationship By Element Id Seek


The UndirectedRelationshipByElementIdSeek operator reads one or more relationships by element id from
the relationship store (specified via the function elementId()). As the direction is unspecified, two rows are
produced for each relationship as a result of alternating the combination of the start and end node.

Example 381. UndirectedRelationshipByElementIdSeek

Query

PROFILE
MATCH (n1)-[r]-()
WHERE elementId(r) = 1
RETURN r, n1

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------------------+--------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details | Estimated Rows
| Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------------------+--------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | r, n1 | 2
| 2 | 0 | | | | |
| | +--------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +UndirectedRelationshipByElementIdSeek| (n1)-[r]-(anon_0) WHERE elementId(r) = $autoint_0 | 2
| 2 | 1 | 120 | 4/0 | 0.332 | Fused in Pipeline 0 |
+---------------------------------+--------------------------------------------+-----+---------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 1, total allocated memory: 184

Undirected Relationship By Id Seek


The UndirectedRelationshipByIdSeek operator reads one or more relationships by id from the relationship
store (specified via the function id()). As the direction is unspecified, two rows are produced for each
relationship as a result of alternating the combination of the start and end node.

Example 382. UndirectedRelationshipByIdSeek

Query

PROFILE
MATCH (n1)-[r]-()
WHERE id(r) = 1
RETURN r, n1

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------------+----+--------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows
| Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------------+----+--------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | r, n1 | 2
| 2 | 14 | 0 | | | |
| | +----+--------------------------------------------+----------------
+------+---------+----------------+ | | |
| +UndirectedRelationshipByIdSeek | 1 | (n1)-[r]-(anon_0) WHERE id(r) = $autoint_0 | 2
| 2 | 1 | 248 | 3/0 | 1.005 | Fused in Pipeline 0 |
+---------------------------------+----+--------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 15, total allocated memory: 312

Directed Relationship Index Contains Scan


The DirectedRelationshipIndexContainsScan operator examines all values stored in an index, searching
for entries containing a specific STRING; for example, in queries including CONTAINS. Although this is slower
than an index seek (since all entries need to be examined), it is still faster than the indirection resulting
from a type scan using DirectedRelationshipTypeScan, and a property store filter.
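
The CONTAINS and ENDS WITH scans shown in the following examples use a TEXT index on the relationship
property (the plans show TEXT INDEX in the Details column). A minimal sketch of creating such an index;
the index name is illustrative:

CREATE TEXT INDEX rel_works_in_title_text IF NOT EXISTS
FOR ()-[r:WORKS_IN]-() ON (r.title)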

Example 383. DirectedRelationshipIndexContainsScan

Query

PROFILE
MATCH ()-[r: WORKS_IN]->()
WHERE r.title CONTAINS 'senior'
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------------------------
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+----------------------------------------
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | r
| 0 | 4 | 0 | | | |
|
| |
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +DirectedRelationshipIndexContainsScan | TEXT INDEX (anon_0)-[r:WORKS_IN(title)]->(anon_1) WHERE
title CONTAINS $autostring_0 | 0 | 4 | 5 | 120 |
3/0 | 1.051 | Fused in Pipeline 0 |
+----------------------------------------
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 5, total allocated memory: 184

Undirected Relationship Index Contains Scan


The UndirectedRelationshipIndexContainsScan operator examines all values stored in an index, searching
for entries containing a specific STRING; for example, in queries including CONTAINS. Although this is slower
than an index seek (since all entries need to be examined), it is still faster than the indirection resulting
from a type scan using UndirectedRelationshipTypeScan, and a property store filter.

Example 384. UndirectedRelationshipIndexContainsScan

Query

PROFILE
MATCH ()-[r: WORKS_IN]-()
WHERE r.title CONTAINS 'senior'
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------------------------
+-------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+------------------------------------------
+-------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | r
| 0 | 8 | 0 | | | |
|
| |
+-------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +UndirectedRelationshipIndexContainsScan | TEXT INDEX (anon_0)-[r:WORKS_IN(title)]-(anon_1) WHERE
title CONTAINS $autostring_0 | 0 | 8 | 5 | 120 |
3/0 | 2.684 | Fused in Pipeline 0 |
+------------------------------------------
+-------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 5, total allocated memory: 184

Directed Relationship Index Ends With Scan


The DirectedRelationshipIndexEndsWithScan operator examines all values stored in an index, searching
for entries ending in a specific STRING; for example, in queries containing ENDS WITH. Although this is slower
than an index seek (since all entries need to be examined), it is still faster than the indirection resulting
from a type scan using DirectedRelationshipTypeScan, and a property store filter.

Example 385. DirectedRelationshipIndexEndsWithScan

Query

PROFILE
MATCH ()-[r: WORKS_IN]->()
WHERE r.title ENDS WITH 'developer'
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------------------------
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+----------------------------------------
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | r
| 0 | 8 | 0 | | | |
|
| |
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +DirectedRelationshipIndexEndsWithScan | TEXT INDEX (anon_0)-[r:WORKS_IN(title)]->(anon_1) WHERE
title ENDS WITH $autostring_0 | 0 | 8 | 9 | 120 |
3/0 | 1.887 | Fused in Pipeline 0 |
+----------------------------------------
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 9, total allocated memory: 184

Undirected Relationship Index Ends With Scan


The UndirectedRelationshipIndexEndsWithScan operator examines all values stored in an index, searching
for entries ending in a specific STRING; for example, in queries containing ENDS WITH. Although this is slower
than an index seek (since all entries need to be examined), it is still faster than the indirection resulting
from a type scan using UndirectedRelationshipTypeScan, and a property store filter.

Example 386. UndirectedRelationshipIndexEndsWithScan

Query

PROFILE
MATCH ()-[r: WORKS_IN]-()
WHERE r.title ENDS WITH 'developer'
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------------------------
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+------------------------------------------
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | r
| 0 | 16 | 0 | | | |
|
| |
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +UndirectedRelationshipIndexEndsWithScan | TEXT INDEX (anon_0)-[r:WORKS_IN(title)]-(anon_1) WHERE
title ENDS WITH $autostring_0 | 0 | 16 | 9 | 120 |
3/0 | 1.465 | Fused in Pipeline 0 |
+------------------------------------------
+--------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 9, total allocated memory: 184

Directed Relationship Index Seek By Range


The DirectedRelationshipIndexSeekByRange operator finds relationships and their start and end nodes
using an index seek where the value of the property lies within a given range or matches a given prefix
STRING. DirectedRelationshipIndexSeekByRange can be used for STARTS WITH and for comparison operators
such as <, >, <= and >=.
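
Besides the numeric range predicate in the example below, the operator also covers prefix predicates. A
sketch of a STARTS WITH query that would typically be planned with DirectedRelationshipIndexSeekByRange,
assuming a RANGE index on the title property of WORKS_IN relationships:

MATCH (candidate:Person)-[r:WORKS_IN]->(location)
WHERE r.title STARTS WITH 'chief'
RETURN candidate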

Example 387. DirectedRelationshipIndexSeekByRange

Query

PROFILE
MATCH (candidate: Person)-[r:WORKS_IN]->(location)
WHERE r.duration > 100
RETURN candidate

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------------------
+----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+---------------------------------------
+----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | candidate
| 4 | 15 | 0 | | | |
|
| |
+----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Filter | candidate:Person
| 4 | 15 | 30 | | | |
|
| |
+----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +DirectedRelationshipIndexSeekByRange | RANGE INDEX (candidate)-[r:WORKS_IN(duration)]->(location)
WHERE duration > $autoint_0 | 4 | 15 | 16 | 120 |
4/1 | 0.703 | Fused in Pipeline 0 |
+---------------------------------------
+----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 46, total allocated memory: 184

Partitioned Directed Relationship Index Seek By Range (introduced in 5.17)


The PartitionedDirectedRelationshipIndexSeekByRange is a variant of the
DirectedRelationshipIndexSeekByRange operator used by the parallel runtime. It allows the index to be
partitioned into different segments where each segment can be scanned independently in parallel.

Example 388. PartitionedDirectedRelationshipIndexSeekByRange

Query

CYPHER runtime=parallel
PROFILE
MATCH (candidate: Person)-[r:WORKS_IN]->(location)
WHERE r.duration > 100
RETURN candidate

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+--------------------------------------------------+----
+----------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------------------------------------+----
+----------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | candidate
| 4 | 15 | 30 | 1/0 | 1.031 | In Pipeline 1 |
| | +----
+----------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| +Filter | 1 | candidate:Person
| 4 | 15 | 30 | | | |
| | +----
+----------------------------------------------------------------------------------------
+----------------+------+---------+ | | |
| +PartitionedDirectedRelationshipIndexSeekByRange | 2 | RANGE INDEX (candidate)-
[r:WORKS_IN(duration)]->(location) WHERE duration > $autoint_0 | 4 | 15 | 16 |
3/0 | 0.203 | Fused in Pipeline 0 |
+--------------------------------------------------+----
+----------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+

Total database accesses: 76

Undirected Relationship Index Seek By Range


The UndirectedRelationshipIndexSeekByRange operator finds relationships and their start and end nodes
using an index seek where the value of the property lies within a given range or matches a given prefix
STRING. UndirectedRelationshipIndexSeekByRange can be used for STARTS WITH and for comparison operators
such as <, >, <= and >=.

Example 389. UndirectedRelationshipIndexSeekByRange

Query

PROFILE
MATCH (candidate: Person)-[r:WORKS_IN]-(location)
WHERE r.duration > 100
RETURN candidate

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------------------------
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+-----------------------------------------
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | candidate
| 5 | 15 | 0 | | | |
|
| |
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Filter | candidate:Person
| 5 | 15 | 60 | | | |
|
| |
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +UndirectedRelationshipIndexSeekByRange | RANGE INDEX (candidate)-[r:WORKS_IN(duration)]-(location)
WHERE duration > $autoint_0 | 8 | 30 | 16 | 120 |
4/1 | 1.214 | Fused in Pipeline 0 |
+-----------------------------------------
+---------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 76, total allocated memory: 184

Partitioned Undirected Relationship Index Seek By Range (introduced in 5.17)


The PartitionedUndirectedRelationshipIndexSeekByRange is a variant of the
UndirectedRelationshipIndexSeekByRange operator used by the parallel runtime. It allows the index to be
partitioned into different segments where each segment can be scanned independently in parallel.

Example 390. PartitionedUndirectedRelationshipIndexSeekByRange

Query

CYPHER runtime=parallel
PROFILE
MATCH (candidate: Person)-[r:WORKS_IN]-(location)
WHERE r.duration > 100
RETURN candidate

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+----------------------------------------------------+----
+---------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+----------------------------------------------------+----
+---------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | candidate
| 5 | 15 | 30 | 1/0 | 0.918 | In Pipeline 1 |
| | +----
+---------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| +Filter | 1 | candidate:Person
| 5 | 15 | 60 | | | |
| | +----
+---------------------------------------------------------------------------------------
+----------------+------+---------+ | | |
| +PartitionedUndirectedRelationshipIndexSeekByRange | 2 | RANGE INDEX (candidate)-
[r:WORKS_IN(duration)]-(location) WHERE duration > $autoint_0 | 8 | 30 | 16 |
3/0 | 0.413 | Fused in Pipeline 0 |
+----------------------------------------------------+----
+---------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+

Total database accesses: 106

Union Node By Labels Scan


The UnionNodeByLabelsScan operator fetches all nodes that have at least one of the provided labels from
the node label index.
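
The Country|Location label expression is equivalent to an explicit disjunction in a WHERE clause, and a
query written in that form may be planned with the same operator. An equivalent formulation for
comparison:

MATCH (countryOrLocation)
WHERE countryOrLocation:Country OR countryOrLocation:Location
RETURN countryOrLocation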

Query

PROFILE
MATCH (countryOrLocation:Country|Location)
RETURN countryOrLocation

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------+------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+-----------------------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
+------------------------+------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+-----------------------+---------------------+
| +ProduceResults | countryOrLocation | 17 | 11 | 0 |
| | | | |
| | +------------------------------------+----------------+------+---------
+----------------+ | | | |
| +UnionNodeByLabelsScan | countryOrLocation:Country|Location | 17 | 11 | 13 |
120 | 3/1 | 0.660 | countryOrLocation ASC | Fused in Pipeline 0 |
+------------------------+------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+-----------------------+---------------------+

Total database accesses: 13, total allocated memory: 184

Partitioned Union Node By Labels Scan (introduced in 5.17)


The PartitionedUnionNodeByLabelsScan is a variant of the UnionNodeByLabelsScan operator used by the
parallel runtime. It allows the index to be partitioned into different segments where each segment can be
scanned independently in parallel.

Query

CYPHER runtime=parallel
PROFILE
MATCH (countryOrLocation:Country|Location)
RETURN countryOrLocation

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+-----------------------------------+----+------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated Rows | Rows
| DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------------------------+----+------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | countryOrLocation | 17 | 11
| 22 | 2/0 | 1.548 | In Pipeline 1 |
| | +----+------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+
| +PartitionedUnionNodeByLabelsScan | 1 | countryOrLocation:Country|Location | 17 | 11
| 13 | 2/0 | 1.976 | In Pipeline 0 |
+-----------------------------------+----+------------------------------------+----------------
+------+---------+------------------------+-----------+---------------+

Total database accesses: 35

Intersection Node By Labels Scan (introduced in 5.5)


The IntersectionNodeByLabelsScan operator fetches all nodes that have all of the provided labels from the
node label index.
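
The Country&Location expression requires both labels on the same node. The older colon-conjunction
syntax expresses the same pattern and may be planned with the same operator:

MATCH (countryAndLocation:Country:Location)
RETURN countryAndLocation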

Query

PROFILE
MATCH (countryAndLocation:Country&Location)
RETURN countryAndLocation

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------------------+----+-------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------------------+----+-------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | countryAndLocation | 10 | 0 |
0 | | | | |
| | +----+-------------------------------------+----------------+------
+---------+----------------+ | | |
| +IntersectionNodeByLabelsScan | 1 | countryAndLocation:Country&Location | 10 | 0 |
0 | 120 | 0/0 | 1.011 | Fused in Pipeline 0 |
+-------------------------------+----+-------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 13, total allocated memory: 184

Partitioned Intersection Node By Labels Scan (introduced in 5.17)


The PartitionedIntersectionNodeByLabelsScan is a variant of the IntersectionNodeByLabelsScan
operator used by the parallel runtime. It allows the index to be partitioned into different segments where
each segment can be scanned independently in parallel.

Query

CYPHER runtime=parallel
PROFILE
MATCH (countryAndLocation:Country&Location)
RETURN countryAndLocation

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+------------------------------------------+----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated
Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------------------------+----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | countryAndLocation |
3 | 0 | 0 | 0/0 | 0.018 | In Pipeline 1 |
| | +----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +PartitionedIntersectionNodeByLabelsScan | 1 | countryAndLocation:Country&Location |
3 | 0 | 13 | 2/0 | 0.770 | In Pipeline 0 |
+------------------------------------------+----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+

Total database accesses: 13

Subtraction Node By Labels Scan (introduced in 5.21)


The SubtractionNodeByLabelsScan operator fetches all nodes that have all of the first set of provided
labels and none of the second provided set of labels from the node label index. In the example below,
SubtractionNodeByLabelsScan retrieves all nodes that have the Location label but do not have the Country
label.

Query

PROFILE
MATCH (locationNotCountry:Location&!Country)
RETURN locationNotCountry

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------------+----+--------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------------+----+--------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | locationNotCountry | 7 | 10 |
0 | 0 | | | |
| | +----+--------------------------------------+----------------+------
+---------+----------------+ | | |
| +SubtractionNodeByLabelsScan | 1 | locationNotCountry:Location&!Country | 7 | 10 |
13 | 248 | 2/0 | 3.081 | Fused in Pipeline 0 |
+------------------------------+----+--------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 13, total allocated memory: 312

Partitioned Subtraction Node By Labels Scan (introduced in 5.21)


The PartitionedSubtractionNodeByLabelsScan is a variant of the SubtractionNodeByLabelsScan operator
used by the parallel runtime. It allows the index to be partitioned into different segments where each
segment can be scanned independently in parallel.

Query

CYPHER runtime=parallel
PROFILE
MATCH (locationNotCountry:Location&!Country)
RETURN locationNotCountry

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+-----------------------------------------+----+--------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| Operator | Id | Details | Estimated
Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------------------------------+----+--------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| +ProduceResults | 0 | locationNotCountry |
7 | 10 | 0 | 136 | 0/0 | 0.614 | In Pipeline 1 |
| | +----+--------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| +PartitionedSubtractionNodeByLabelsScan | 1 | locationNotCountry:Location&!Country |
7 | 10 | 13 | 120 | 2/0 | 5.173 | In Pipeline 0 |
+-----------------------------------------+----+--------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+

Total database accesses: 13, total allocated memory: 262144

Directed All Relationships Scan


The DirectedAllRelationshipsScan operator fetches all relationships and their start and end nodes in the
database.

Query

PROFILE
MATCH ()-[r]->() RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------------------+------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------------------+------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | r | 28 | 28 | 0 |
| | | |
| | +------------------------+----------------+------+---------
+----------------+ | | |
| +DirectedAllRelationshipsScan | (anon_0)-[r]->(anon_1) | 28 | 28 | 28 |
120 | 3/0 | 0.502 | Fused in Pipeline 0 |
+-------------------------------+------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 28, total allocated memory: 184

Partitioned Directed All Relationships Scan (introduced in 5.17)


The PartitionedDirectedAllRelationshipsScan is a variant of the DirectedAllRelationshipsScan
operator used by the parallel runtime. It allows the store to be partitioned into different segments where
each segment can be scanned independently in parallel.

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[r]->() RETURN r

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+------------------------------------------+----+------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated Rows | Rows | DB
Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------------------------+----+------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | r | 28 | 28 |
83 | 2/0 | 3.872 | In Pipeline 1 |
| | +----+------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +PartitionedDirectedAllRelationshipsScan | 1 | (anon_0)-[r]->(anon_1) | 28 | 28 |
28 | 3/0 | 1.954 | In Pipeline 0 |
+------------------------------------------+----+------------------------+----------------+------
+---------+------------------------+-----------+---------------+

Total database accesses: 111

Undirected All Relationships Scan


The UndirectedAllRelationshipsScan operator fetches all relationships and their start and end nodes in
the database.

Query

PROFILE
MATCH ()-[r]-() RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------------+-----------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------------+-----------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | r | 56 | 56 | 0 |
| | | |
| | +-----------------------+----------------+------+---------
+----------------+ | | |
| +UndirectedAllRelationshipsScan | (anon_0)-[r]-(anon_1) | 56 | 56 | 28 |
120 | 3/0 | 1.110 | Fused in Pipeline 0 |
+---------------------------------+-----------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 28, total allocated memory: 184

Partitioned Undirected All Relationships Scan (introduced in 5.17)


The PartitionedUndirectedAllRelationshipsScan is a variant of the UndirectedAllRelationshipsScan
operator used by the parallel runtime. It allows the store to be partitioned into different segments where
each segment can be scanned independently in parallel.

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[r]-() RETURN r

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+--------------------------------------------+----+-----------------------+----------------+------
+---------+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated Rows | Rows |
DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------------------------------+----+-----------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | r | 56 | 56 |
166 | 2/0 | 4.905 | In Pipeline 1 |
| | +----+-----------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +PartitionedUndirectedAllRelationshipsScan | 1 | (anon_0)-[r]-(anon_1) | 56 | 56 |
28 | 9/0 | 1.573 | In Pipeline 0 |
+--------------------------------------------+----+-----------------------+----------------+------
+---------+------------------------+-----------+---------------+

Total database accesses: 194

Directed Relationship Type Scan


The DirectedRelationshipTypeScan operator fetches all relationships and their start and end nodes with a
specific type from the relationship type index.
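
The relationship type index referred to here is the token lookup index that is typically present by
default in a database. One way to confirm that it exists is to list the lookup indexes; a sketch, with
columns as yielded by SHOW INDEXES:

SHOW LOOKUP INDEXES YIELD name, entityType, state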

Example 391. DirectedRelationshipTypeScan

Query

PROFILE
MATCH ()-[r: FRIENDS_WITH]->()
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------------------+-------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------------------+-------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | r | 12 | 12 |
0 | | | | |
| | +-------------------------------------+----------------+------
+---------+----------------+ | | |
| +DirectedRelationshipTypeScan | (anon_0)-[r:FRIENDS_WITH]->(anon_1) | 12 | 12 |
13 | 120 | 2/1 | 0.557 | Fused in Pipeline 0 |
+-------------------------------+-------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 13, total allocated memory: 184

Partitioned Directed Relationship Type Scan (introduced in 5.17)


The PartitionedDirectedRelationshipTypeScan is a variant of the DirectedRelationshipTypeScan
operator used by the parallel runtime. It allows the index to be partitioned into different segments where
each segment can be scanned independently in parallel.

Example 392. PartitionedDirectedRelationshipTypeScan

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[r: FRIENDS_WITH]->()
RETURN r

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+------------------------------------------+----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated
Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------------------------+----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | r |
12 | 12 | 12 | 0/0 | 0.560 | In Pipeline 1 |
| | +----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +PartitionedDirectedRelationshipTypeScan | 1 | (anon_0)-[r:FRIENDS_WITH]->(anon_1) |
12 | 12 | 13 | 2/0 | 0.167 | In Pipeline 0 |
+------------------------------------------+----+-------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+

Total database accesses: 25

Undirected Relationship Type Scan


The UndirectedRelationshipTypeScan operator fetches all relationships and their start and end nodes with
a specific type from the relationship type index.

Query

PROFILE
MATCH ()-[r: FRIENDS_WITH]-()
RETURN r

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------------+------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------------+------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | r | 24 | 24 |
0 | | | | |
| | +------------------------------------+----------------+------
+---------+----------------+ | | |
| +UndirectedRelationshipTypeScan | (anon_0)-[r:FRIENDS_WITH]-(anon_1) | 24 | 24 |
13 | 120 | 2/1 | 0.749 | Fused in Pipeline 0 |
+---------------------------------+------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 13, total allocated memory: 184

Partitioned Undirected Relationship Type Scan (introduced in 5.17)


The PartitionedUndirectedRelationshipTypeScan is a variant of the UndirectedRelationshipTypeScan
operator used by the parallel runtime. It allows the index to be partitioned into different segments where
each segment can be scanned independently in parallel.

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[r: FRIENDS_WITH]-()
RETURN r

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+--------------------------------------------+----+------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated
Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------------------------------+----+------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | r |
24 | 24 | 24 | 1/0 | 1.466 | In Pipeline 1 |
| | +----+------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +PartitionedUndirectedRelationshipTypeScan | 1 | (anon_0)-[r:FRIENDS_WITH]-(anon_1) |
24 | 24 | 13 | 2/0 | 0.171 | In Pipeline 0 |
+--------------------------------------------+----+------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+

Total database accesses: 37

Directed Union Relationship Types Scan


The DirectedUnionRelationshipTypesScan operator fetches all relationships and their start and end nodes
with at least one of the provided types from the relationship type index.

As the block storage format becomes the default, this operator will cease to be used in generating
plans. Please refer to Operations Manual → Store formats for further details on the various store
formats available.
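
The store format of a database can be checked from Cypher; a minimal sketch, assuming the store column
of SHOW DATABASES, which reports the format in use:

SHOW DATABASES YIELD name, store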

Example 393. DirectedUnionRelationshipTypesScan

Query

PROFILE
MATCH ()-[friendOrFoe: FRIENDS_WITH|FOE]->()
RETURN friendOrFoe

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------------------------+---------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+-----------------+---------------------+
| Operator | Details | Estimated
Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by |
Pipeline |
+-------------------------------------+---------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+-----------------+---------------------+
| +ProduceResults | friendOrFoe |
15 | 12 | 0 | | | | |
|
| | +---------------------------------------------------
+----------------+------+---------+----------------+ | |
| |
| +DirectedUnionRelationshipTypesScan | (anon_0)-[friendOrFoe:FRIENDS_WITH|FOE]->(anon_1) |
15 | 12 | 14 | 120 | 3/1 | 2.027 | friendOrFoe ASC | Fused
in Pipeline 0 |
+-------------------------------------+---------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+-----------------+---------------------+

Total database accesses: 14, total allocated memory: 184

Partitioned Directed Union Relationship Types Scan (introduced in 5.17)


The PartitionedDirectedUnionRelationshipTypesScan is a variant of the
DirectedUnionRelationshipTypesScan operator used by the parallel runtime. It allows the index to be
partitioned into different segments where each segment can be scanned independently in parallel.

As the block storage format becomes the default, this operator will cease to be used in generating
plans. Please refer to Operations Manual → Store formats for further details on the various store
formats available.

Example 394. PartitionedDirectedUnionRelationshipTypesScan

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[friendOrFoe: FRIENDS_WITH|FOE]->()
RETURN friendOrFoe

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+------------------------------------------------+----
+---------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------------------------------+----
+---------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+
| +ProduceResults | 0 | friendOrFoe
| 15 | 12 | 12 | 0/0 | 0.570 | In Pipeline 1 |
| | +----
+---------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+
| +PartitionedDirectedUnionRelationshipTypesScan | 1 | (anon_0)-[friendOrFoe:FRIENDS_WITH|FOE]-
>(anon_1) | 15 | 12 | 13 | 2/0 | 0.170 | In Pipeline 0 |
+------------------------------------------------+----
+---------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+

Total database accesses: 25

Undirected Union Relationship Types Scan


The UndirectedUnionRelationshipTypesScan operator fetches all relationships and their start and end
nodes with at least one of the provided types from the relationship type index.

As the block storage format becomes the default, this operator will cease to be used in generating
plans. Please refer to Operations Manual → Store formats for further details on the various store
formats available.

Example 395. UndirectedUnionRelationshipTypesScan

Query

PROFILE
MATCH ()-[friendOrFoe: FRIENDS_WITH|FOE]-()
RETURN friendOrFoe

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------------------+--------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+-----------------+---------------------+
| Operator | Details |
Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by
| Pipeline |
+---------------------------------------+--------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+-----------------+---------------------+
| +ProduceResults | friendOrFoe |
30 | 24 | 0 | | | | |
|
| | +--------------------------------------------------
+----------------+------+---------+----------------+ | |
| |
| +UndirectedUnionRelationshipTypesScan | (anon_0)-[friendOrFoe:FRIENDS_WITH|FOE]-(anon_1) |
30 | 24 | 14 | 120 | 3/1 | 0.887 | friendOrFoe ASC | Fused
in Pipeline 0 |
+---------------------------------------+--------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+-----------------+---------------------+

Total database accesses: 14, total allocated memory: 184

Partitioned Undirected Union Relationship Types Scan (introduced in 5.17)


The PartitionedUndirectedUnionRelationshipTypesScan is a variant of the
UndirectedUnionRelationshipTypesScan operator used by the parallel runtime. It allows the index to be
partitioned into different segments where each segment can be scanned independently in parallel.

As the block storage format becomes the default, this operator will cease to be used in generating
plans. Please refer to Operations Manual → Store formats for further details on the various store
formats available.

Example 396. PartitionedUndirectedUnionRelationshipTypesScan

Query

CYPHER runtime=parallel
PROFILE
MATCH ()-[friendOrFoe: FRIENDS_WITH|FOE]-()
RETURN friendOrFoe

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+--------------------------------------------------+----
+--------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------------------------------------+----
+--------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+
| +ProduceResults | 0 | friendOrFoe
| 30 | 24 | 24 | 0/0 | 0.896 | In Pipeline 1 |
| | +----
+--------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+
| +PartitionedUndirectedUnionRelationshipTypesScan | 1 | (anon_0)-[friendOrFoe:FRIENDS_WITH|FOE]-
(anon_1) | 30 | 24 | 13 | 2/0 | 0.818 | In Pipeline 0 |
+--------------------------------------------------+----
+--------------------------------------------------+----------------+------+---------
+------------------------+-----------+---------------+

Total database accesses: 37

Node By ElementId Seek (introduced in 5.3)


The NodeByElementIdSeek operator reads one or more nodes by element id from the node store, specified via
the function elementId().

Example 397. NodeByElementIdSeek

Query

PROFILE
MATCH (n)
WHERE elementId(n) = 0
RETURN n

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | n | 1 | 1 | 0 |
| | | |
| | +-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+----------------------
| | |
| +NodeByElementIdSeek | n WHERE elementId(n) = $autoint_0 | 1 | 1 | 1 |
120 | 3/0 | 2.108 | Fused in Pipeline 0 |
+------------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 1, total allocated memory: 184

Node By Id Seek
The NodeByIdSeek operator reads one or more nodes by id from the node store, specified via the function
id().

Example 398. NodeByIdSeek

Query

PROFILE
MATCH (n)
WHERE id(n) = 0
RETURN n

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----+----------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----+----------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | n | 1 | 1 | 2 |
0 | | | |
| | +----+----------------------------+----------------+------+---------
+----------------+ | | |
| +NodeByIdSeek | 1 | n WHERE id(n) = $autoint_0 | 1 | 1 | 1 |
248 | 2/0 | 1.109 | Fused in Pipeline 0 |
+-----------------+----+----------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 3, total allocated memory: 312


Node By Label Scan


The NodeByLabelScan operator fetches all nodes with a specific label from the node label index.

Example 399. NodeByLabelScan

Query

PROFILE
MATCH (person:Person)
RETURN person

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+------------------+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | person | 14 | 14 | 0 | |
| | |
| | +---------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | person:Person | 14 | 14 | 15 | 120 |
2/1 | 0.522 | Fused in Pipeline 0 |
+------------------+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 15, total allocated memory: 184

Partitioned Node By Label Scan (introduced in 5.17)


The PartitionedNodeByLabelScan is a variant of the NodeByLabelScan operator used by the parallel
runtime. It allows the index to be partitioned into different segments where each segment can be scanned
independently in parallel.

Example 400. PartitionedNodeByLabelScan

Query

CYPHER runtime=parallel
PROFILE
MATCH (person:Person)
RETURN person

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+-----------------------------+----+---------------+----------------+------+---------
+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+-----------------------------+----+---------------+----------------+------+---------
+------------------------+-----------+---------------+
| +ProduceResults | 0 | person | 14 | 14 | 28 |
2/0 | 0.623 | In Pipeline 1 |
| | +----+---------------+----------------+------+---------
+------------------------+-----------+---------------+
| +PartitionedNodeByLabelScan | 1 | person:Person | 14 | 14 | 15 |
1/0 | 0.094 | In Pipeline 0 |
+-----------------------------+----+---------------+----------------+------+---------
+------------------------+-----------+---------------+

Total database accesses: 43

Node Index Seek


The NodeIndexSeek operator finds nodes using an index seek. The node variable and the index used are
shown in the arguments of the operator. If the index is a unique index, the operator is instead called
NodeUniqueIndexSeek.
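
A node index seek can only be planned if a matching index exists on the label and property being queried. A minimal sketch of creating such an index for the example below, assuming Neo4j 5 syntax (the index name is illustrative):

CREATE INDEX location_name IF NOT EXISTS
FOR (l:Location) ON (l.name)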

Example 401. NodeIndexSeek

Query

PROFILE
MATCH (location:Location {name: 'Malmo'})
RETURN location

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | location | 1 |
1 | 0 | | | | |
| | +----------------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeIndexSeek | RANGE INDEX location:Location(name) WHERE name = $autostring_0 | 1 |
1 | 2 | 120 | 2/1 | 0.401 | Fused in Pipeline 0 |
+-----------------+----------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 2, total allocated memory: 184

Partitioned Node Index Seek (Introduced in 5.17)


The PartitionedNodeIndexSeek is a variant of the NodeIndexSeek operator used by the parallel runtime. It
allows the index to be partitioned into different segments where each segment can be scanned
independently in parallel.

Example 402. PartitionedNodeIndexSeek

Query

CYPHER runtime=parallel
PROFILE
MATCH (location:Location {name: 'Malmo'})
RETURN location

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+---------------------------+----+----------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details |
Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------+----+----------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | location |
0 | 1 | 2 | 2/0 | 0.179 | In Pipeline 1 |
| | +----+----------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +PartitionedNodeIndexSeek | 1 | RANGE INDEX location:Location(name) WHERE name = $autostring_0 |
0 | 1 | 2 | 1/0 | 0.167 | In Pipeline 0 |
+---------------------------+----+----------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+

Total database accesses: 4

Node Unique Index Seek


The NodeUniqueIndexSeek operator finds nodes using an index seek within a unique index. The node
variable and the index used are shown in the arguments of the operator. If the index is not unique, the
operator is instead called NodeIndexSeek. If the index seek is used to solve a MERGE clause, it will also be
marked with (Locking). This makes it clear that any nodes returned from the index will be locked in order
to prevent concurrent conflicting updates.
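
The (Locking) variant typically appears when a MERGE matches on a property backed by a uniqueness constraint. A minimal sketch, assuming a uniqueness constraint exists on :Team(name):

MERGE (t:Team {name: 'Engineering'})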

Example 403. NodeUniqueIndexSeek

Query

PROFILE
MATCH (t:Team {name: 'Malmo'})
RETURN t

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------+------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+----------------------+------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | t | 1 | 0 |
0 | | | | |
| | +------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeUniqueIndexSeek | UNIQUE t:Team(name) WHERE name = $autostring_0 | 1 | 0 |
1 | 120 | 0/1 | 0.280 | Fused in Pipeline 0 |
+----------------------+------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 1, total allocated memory: 184

Multi Node Index Seek


The MultiNodeIndexSeek operator finds nodes using multiple index seeks. It supports using multiple
distinct indexes for different nodes in the query. The node variables and the indexes used are shown in the
arguments of the operator.

The operator yields a cartesian product of all index seeks. For example, if the operator does two seeks and
the first seek finds the nodes a1, a2 and the second b1, b2, b3, the MultiNodeIndexSeek will yield the
rows (a1, b1), (a1, b2), (a1, b3), (a2, b1), (a2, b2), (a2, b3).

Example 404. MultiNodeIndexSeek

Query

PROFILE
CYPHER runtime=pipelined
MATCH
(location:Location {name: 'Malmo'}),
(person:Person {name: 'Bob'})
RETURN location, person

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------+-----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details | Estimated
Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------+-----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | location, person |
1 | 1 | 0 | | | | |
| | +-----------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +MultiNodeIndexSeek | RANGE INDEX location:Location(name) WHERE name = $autostring_0, |
1 | 0 | 0 | 120 | 2/2 | 1.910 | Fused in Pipeline 0 |
| | RANGE INDEX person:Person(name) WHERE name = $autostring_1 |
| | | | | | |
+---------------------+-----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 0, total allocated memory: 184

Asserting Multi Node Index Seek


The AssertingMultiNodeIndexSeek operator is used to ensure that no property uniqueness constraints are
violated. The example looks for the presence of a team with the supplied name and id, and if one does not
exist, it will be created. Owing to the two property uniqueness constraints on :Team(name) and :Team(id),
any node found by the underlying unique index seeks must be the very same node, or the constraints would
be violated.
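
The plan below relies on both property uniqueness constraints already being in place. A minimal sketch of how such constraints could be created, assuming Neo4j 5 syntax (the constraint names are illustrative):

CREATE CONSTRAINT team_name_unique IF NOT EXISTS
FOR (t:Team) REQUIRE t.name IS UNIQUE;

CREATE CONSTRAINT team_id_unique IF NOT EXISTS
FOR (t:Team) REQUIRE t.id IS UNIQUE;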

Example 405. AssertingMultiNodeIndexSeek

Query

PROFILE
MERGE (t:Team {name: 'Engineering', id: 42})

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------------
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+------------------------------
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults |
| 1 | 0 | 0 | | | |
|
| |
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +EmptyResult |
| 1 | 0 | 0 | | | |
|
| |
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Merge | CREATE (t:Team {name: $autostring_0, id: $autoint_1})
| 1 | 1 | 0 | | | |
|
| |
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +AssertingMultiNodeIndexSeek | UNIQUE t:Team(name) WHERE name = $autostring_0, UNIQUE t:Team(id)
WHERE id = $autoint_1 | 0 | 2 | 4 | 120 | 0/2 |
1.584 | Fused in Pipeline 0 |
+------------------------------
+-----------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 4, total allocated memory: 184

Node Index Seek By Range


The NodeIndexSeekByRange operator finds nodes using an index seek where the value of the property lies
within a given range or matches a given prefix STRING. NodeIndexSeekByRange can be used for STARTS WITH
and for comparison operators such as <, >, <=, and >=. If the index is a unique index, the operator is instead
called NodeUniqueIndexSeekByRange.
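
Besides the STARTS WITH predicate shown in the example below, an inequality predicate on an indexed property is usually served by the same operator. A minimal sketch (the comparison value is illustrative):

MATCH (l:Location)
WHERE l.name >= 'London'
RETURN l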

Example 406. NodeIndexSeekByRange

Query

PROFILE
MATCH (l:Location)
WHERE l.name STARTS WITH 'Lon'
RETURN l

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------+-------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details |
Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+-----------------------+-------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | l |
2 | 1 | 0 | | | | |
| | +-------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexSeekByRange | RANGE INDEX l:Location(name) WHERE name STARTS WITH $autostring_0 |
2 | 1 | 2 | 120 | 3/0 | 0.825 | Fused in Pipeline 0 |
+-----------------------+-------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 2, total allocated memory: 184

Partitioned Node Index Seek By Range (Introduced in 5.17)


The PartitionedNodeIndexSeekByRange is a variant of the NodeIndexSeekByRange operator used by the
parallel runtime. It allows the index to be partitioned into different segments where each segment can be
scanned independently in parallel.

Example 407. PartitionedNodeIndexSeekByRange

Query

CYPHER runtime=parallel
PROFILE
MATCH (l:Location)
WHERE l.name STARTS WITH 'Lon'
RETURN l

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+----------------------------------+----
+-------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+----------------------------------+----
+-------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | l
| 0 | 1 | 2 | 2/0 | 0.191 | In Pipeline 1 |
| | +----
+-------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+
| +PartitionedNodeIndexSeekByRange | 1 | RANGE INDEX l:Location(name) WHERE name STARTS WITH
$autostring_0 | 0 | 1 | 2 | 1/0 | 0.087 | In Pipeline 0
|
+----------------------------------+----
+-------------------------------------------------------------------+----------------+------
+---------+------------------------+-----------+---------------+

Total database accesses: 4

Node Unique Index Seek By Range


The NodeUniqueIndexSeekByRange operator finds nodes using an index seek within a unique index, where
the value of the property lies within a given range or matches a given prefix STRING.
NodeUniqueIndexSeekByRange can be used for STARTS WITH and for comparison operators such as <, >, <=,
and >=. If the index is not unique, the operator is instead called NodeIndexSeekByRange.

Example 408. NodeUniqueIndexSeekByRange

Query

PROFILE
MATCH (t:Team)
WHERE t.name STARTS WITH 'Ma'
RETURN t

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------------+----------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details | Estimated
Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------------------+----------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | t |
2 | 0 | 0 | | | | |
| | +----------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeUniqueIndexSeekByRange | UNIQUE t:Team(name) WHERE name STARTS WITH $autostring_0 |
2 | 0 | 1 | 120 | 1/0 | 0.623 | Fused in Pipeline 0 |
+-----------------------------+----------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 1, total allocated memory: 184

Node Index Contains Scan


The NodeIndexContainsScan operator examines all values stored in an index, searching for entries
containing a specific STRING; for example, in queries that use CONTAINS. Although this is slower than an
index seek (since all entries need to be examined), it is still faster than a label scan using NodeByLabelScan
followed by a filter on the property store.
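
As the Details column in the plan below shows, CONTAINS scans are served by a TEXT index on the property. A minimal sketch of creating one, assuming Neo4j 5 syntax (the index name is illustrative):

CREATE TEXT INDEX location_name_text IF NOT EXISTS
FOR (l:Location) ON (l.name)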

Example 409. NodeIndexContainsScan

Query

PROFILE
MATCH (l:Location)
WHERE l.name CONTAINS 'al'
RETURN l

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------+---------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details | Estimated
Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------+---------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | l |
0 | 2 | 0 | | | | |
| | +---------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexContainsScan | TEXT INDEX l:Location(name) WHERE name CONTAINS $autostring_0 |
0 | 2 | 3 | 120 | 2/0 | 1.305 | Fused in Pipeline 0 |
+------------------------+---------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 3, total allocated memory: 184

Node Index Ends With Scan


The NodeIndexEndsWithScan operator examines all values stored in an index, searching for entries ending
with a specific STRING; for example, in queries that use ENDS WITH. Although this is slower than an index
seek (since all entries need to be examined), it is still faster than a label scan using NodeByLabelScan
followed by a filter on the property store.

Example 410. NodeIndexEndsWithScan

Query

PROFILE
MATCH (l:Location)
WHERE l.name ENDS WITH 'al'
RETURN l

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------+----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details | Estimated
Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------+----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | l |
0 | 0 | 0 | | | | |
| | +----------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexEndsWithScan | TEXT INDEX l:Location(name) WHERE name ENDS WITH $autostring_0 |
0 | 0 | 1 | 120 | 0/0 | 4.409 | Fused in Pipeline 0 |
+------------------------+----------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 1, total allocated memory: 184

Node Index Scan


The NodeIndexScan operator examines all values stored in an index, returning all nodes with a particular
label that have a specified property.

Example 411. NodeIndexScan

Query

PROFILE
MATCH (l:Location)
WHERE l.name IS NOT NULL
RETURN l

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-----------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+-----------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | l | 10 | 10 |
0 | | | | |
| | +-----------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeIndexScan | RANGE INDEX l:Location(name) WHERE name IS NOT NULL | 10 | 10 |
11 | 120 | 2/1 | 0.557 | Fused in Pipeline 0 |
+-----------------+-----------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 11, total allocated memory: 184

Partitioned Node Index Scan (Introduced in 5.17)


The PartitionedNodeIndexScan is a variant of the NodeIndexScan operator used by the parallel runtime. It
allows the index to be partitioned into different segments where each segment can be scanned
independently in parallel.

Example 412. PartitionedNodeIndexScan

Query

CYPHER runtime=parallel
PROFILE
MATCH (l:Location)
WHERE l.name IS NOT NULL
RETURN l

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+---------------------------+----+-----------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| Operator | Id | Details | Estimated
Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------+----+-----------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +ProduceResults | 0 | l |
1 | 10 | 20 | 2/0 | 0.472 | In Pipeline 1 |
| | +----+-----------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+
| +PartitionedNodeIndexScan | 1 | RANGE INDEX l:Location(name) WHERE name IS NOT NULL |
1 | 10 | 11 | 1/0 | 0.187 | In Pipeline 0 |
+---------------------------+----+-----------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------+

Total database accesses: 31

Apply
All the different Apply operators (listed below) share the same basic functionality: they perform a nested
loop. For each row coming from the left-hand side, the operator tree on the right-hand side is executed, with
the Argument operator supplying that row to it. The Apply variants differ in how the results are managed.
The Apply operator (i.e. the standard version) takes each row produced by the right-hand side, which at this
point contains data from both the left-hand and right-hand sides, and yields it.

Example 413. Apply

Query

PROFILE
MATCH (p:Person {name: 'me'})
MATCH (q:Person {name: p.secondName})
RETURN p, q

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | p, q | 1 | 0 |
0 | | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Apply | | 1 | 0 |
0 | | | | |
| |\ +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| | +NodeIndexSeek | RANGE INDEX q:Person(name) WHERE name = p.secondName | 1 | 0 |
0 | 2152 | 0/0 | 0.219 | Fused in Pipeline 1 |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +NodeIndexSeek | RANGE INDEX p:Person(name) WHERE name = $autostring_0 | 1 | 1 |
2 | 120 | 0/1 | 0.236 | In Pipeline 0 |
+------------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 2, total allocated memory: 2216

Semi Apply
The SemiApply operator tests for the presence of a pattern predicate, and is a variation of the Apply
operator. If the right-hand side operator yields at least one row, the row from the left-hand side operator is
yielded by the SemiApply operator. This makes SemiApply a filtering operator, used mostly for pattern
predicates in queries.

Example 414. SemiApply

Query

PROFILE
CYPHER runtime=slotted
MATCH (p:Person)
WHERE (p)-[:FRIENDS_WITH]->(:Person)
RETURN p.name

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------+-------------------------------------+----------------+------+---------
+------------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Page
Cache Hits/Misses |
+-----------------+-------------------------------------+----------------+------+---------
+------------------------+
| +ProduceResults | `p.name` | 11 | 10 | 0 |
0/0 |
| | +-------------------------------------+----------------+------+---------
+------------------------+
| +Projection | p.name AS `p.name` | 11 | 10 | 10 |
1/0 |
| | +-------------------------------------+----------------+------+---------
+------------------------+
| +SemiApply | | 11 | 10 | 0 |
0/0 |
| |\ +-------------------------------------+----------------+------+---------
+------------------------+
| | +Filter | anon_3:Person | 12 | 0 | 10 |
0/0 |
| | | +-------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (p)-[anon_2:FRIENDS_WITH]->(anon_3) | 12 | 10 | 51 |
28/0 |
| | | +-------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | p | 14 | 14 | 0 |
0/0 |
| | +-------------------------------------+----------------+------+---------
+------------------------+
| +NodeByLabelScan| p:Person | 14 | 14 | 35 |
1/0 |
+-----------------+-------------------------------------+----------------+------+---------
+------------------------+

Total database accesses: 142, total allocated memory: 64

Anti Semi Apply


The AntiSemiApply operator tests for the absence of a pattern, and is a variation of the Apply operator. If
the right-hand side operator yields no rows, the row from the left-hand side operator is yielded by the
AntiSemiApply operator. This makes AntiSemiApply a filtering operator, used for pattern predicates in
queries.

Example 415. AntiSemiApply

Query

PROFILE
CYPHER runtime=slotted
MATCH
(me:Person {name: 'me'}),
(other:Person)
WHERE NOT (me)-[:FRIENDS_WITH]->(other)
RETURN other.name

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-------------------+--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| Operator | Details | Estimated Rows | Rows
| DB Hits | Memory (Bytes) | Page Cache Hits/Misses |
+-------------------+--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| +ProduceResults | `other.name` | 4 | 12
| 0 | | 0/0 |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| +Projection | other.name AS `other.name` | 4 | 12
| 12 | | 1/0 |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| +AntiSemiApply | | 4 | 12
| 0 | | 0/0 |
| |\ +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| | +Expand(Into) | (me)-[anon_2:FRIENDS_WITH]->(other) | 1 | 0
| 81 | 896 | 28/0 |
| | | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| | +Argument | me, other | 14 | 14
| 0 | | 0/0 |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| +CartesianProduct | | 14 | 14
| 0 | | 0/0 |
| |\ +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| | +NodeByLabelScan| other:Person | 14 | 14
| 35 | | 1/0 |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+
| +NodeIndexSeek | RANGE INDEX me:Person(name) WHERE name = $autostring_0 | 1 | 1
| 2 | | 0/1 |
+-------------------+--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+

Total database accesses: 166, total allocated memory: 976

TransactionApply
TransactionApply works like the Apply operator but will commit the current transaction after a specified
number of rows.
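
The batch size and error handling are controlled on the IN TRANSACTIONS clause itself. A minimal sketch that commits every 100 rows and continues past failing batches, assuming a recent Neo4j 5 release that supports ON ERROR (the file URL is illustrative):

LOAD CSV FROM 'file:///artists.csv' AS line
CALL (line) {
  CREATE (:Artist {name: line[0]})
} IN TRANSACTIONS OF 100 ROWS ON ERROR CONTINUE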

Example 416. TransactionApply

The query below uses a variable scope clause (introduced in Neo4j 5.23) to import variables into the CALL
subquery. If you are using an older version of Neo4j, use an importing WITH clause instead.

Query

PROFILE
LOAD CSV FROM 'https://neo4j.com/docs/cypher-refcard/3.3/csv/artists.csv' AS line
CALL (line) {
  CREATE (a:Artist {name: line[0]})
  RETURN a
} IN TRANSACTIONS OF 100 ROWS
RETURN a;

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------+----+--------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+--------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | a | 10 | 4 |
8 | 0 | | | |
| | +----+--------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +TransactionApply | 1 | IN TRANSACTIONS OF $autoint_1 ROWS ON ERROR FAIL | 10 | 4 |
0 | 2152 | 0/0 | 2.036 | Fused in Pipeline 3 |
| |\ +----+--------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| | +Create | 2 | (a:Artist {name: line[$autoint_0]}) | 10 | 4 |
16 | | | | |
| | | +----+--------------------------------------------------+----------------+------
+---------+----------------+ | | |
| | +Argument | 3 | line | 10 | 4 |
0 | 3472 | 0/0 | 32.746 | Fused in Pipeline 2 |
| | +----+--------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +LoadCSV | 4 | line | 10 | 4 |
0 | 328 | | | In Pipeline 1 |
+-------------------+----+--------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 24, total allocated memory: 5472

Anti
The Anti operator tests for the absence of a pattern. If there are incoming rows, the Anti operator will yield
no rows. If there are no incoming rows, the Anti operator will yield a single row.

Example 417. Anti

Query

PROFILE
CYPHER runtime=pipelined
MATCH
(me:Person {name: 'me'}),
(other:Person)
WHERE NOT (me)-[:FRIENDS_WITH]->(other)
RETURN other.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------+--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows
| DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | `other.name` | 4 | 12
| 0 | | 0/0 | 0.068 | |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+ |
| +Projection | other.name AS `other.name` | 4 | 12
| 24 | | 2/0 | 0.111 | In Pipeline 4 |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +Apply | | 4 | 12
| 0 | | 0/0 | | |
| |\ +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| | +Anti | | 4 | 12
| 0 | 1256 | 0/0 | 0.084 | In Pipeline 4 |
| | | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| | +Limit | 1 | 11 | 2
| 0 | 752 | | | |
| | | +--------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| | +Expand(Into) | (me)-[anon_2:FRIENDS_WITH]->(other) | 1 | 2
| 81 | 2632 | | | |
| | | +--------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| | +Argument | me, other | 14 | 14
| 0 | 3192 | 1/0 | 0.904 | Fused in Pipeline 3 |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +CartesianProduct | | 14 | 14
| 0 | 3672 | | 1.466 | In Pipeline 2 |
| |\ +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| | +NodeByLabelScan| other:Person | 14 | 14
| 35 | | | | |
| | +--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +NodeIndexSeek | RANGE INDEX me:Person(name) WHERE name = $autostring_0 | 1 | 1
| 2 | 120 | 0/1 | 0.493 | In Pipeline 0 |
+-------------------+--------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 178, total allocated memory: 6744

Let Semi Apply
The LetSemiApply operator tests for the presence of a pattern predicate, and is a variation of the Apply
operator. When a query contains multiple pattern predicates separated with OR, LetSemiApply will be used
to evaluate the first of these. It will record the result of evaluating the predicate but will leave any filtering
to another operator. In the example, LetSemiApply will be used to check for the presence of the
FRIENDS_WITH relationship from each person.

Example 418. LetSemiApply

Query

PROFILE
CYPHER runtime=slotted
MATCH (other:Person)
WHERE (other)-[:FRIENDS_WITH]->(:Person) OR (other)-[:WORKS_IN]->(:Location)
RETURN other.name

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+--------------------+-----------------------------------------+----------------+------+---------
+------------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Page Cache Hits/Misses |
+--------------------+-----------------------------------------+----------------+------+---------
+------------------------+
| +ProduceResults | `other.name` | 13 | 14 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +Projection | other.name AS `other.name` | 13 | 14 | 14 |
1/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +SelectOrSemiApply | anon_9 | 14 | 14 | 0 |
0/0 |
| |\ +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Filter | anon_7:Location | 14 | 0 | 4 |
0/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (other)-[anon_6:WORKS_IN]->(anon_7) | 14 | 4 | 15 |
8/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | other | 14 | 4 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +LetSemiApply | | 14 | 14 | 0 |
0/0 |
| |\ +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Filter | anon_5:Person | 12 | 0 | 10 |
0/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (other)-[anon_4:FRIENDS_WITH]->(anon_5) | 12 | 10 | 51 |
28/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | other | 14 | 14 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +NodeByLabelScan | other:Person | 14 | 14 | 35 |
1/0 |
+--------------------+-----------------------------------------+----------------+------+---------
+------------------------+

Total database accesses: 165, total allocated memory: 64

Let Anti Semi Apply
The LetAntiSemiApply operator tests for the absence of a pattern, and is a variation of the Apply operator.
When a query contains multiple negated pattern predicates — i.e. predicates separated with OR, where at
least one predicate contains NOT — LetAntiSemiApply will be used to evaluate the first of these. It will
record the result of evaluating the predicate but will leave any filtering to another operator. In the example,
LetAntiSemiApply will be used to check for the absence of the FRIENDS_WITH relationship from each person.

Example 419. LetAntiSemiApply

Query

PROFILE
CYPHER runtime=slotted
MATCH (other:Person)
WHERE NOT ((other)-[:FRIENDS_WITH]->(:Person)) OR (other)-[:WORKS_IN]->(:Location)
RETURN other.name

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25


+--------------------+-----------------------------------------+----------------+------+---------
+------------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Page Cache Hits/Misses |
+--------------------+-----------------------------------------+----------------+------+---------
+------------------------+
| +ProduceResults | `other.name` | 11 | 14 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +Projection | other.name AS `other.name` | 11 | 14 | 14 |
1/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +SelectOrSemiApply | anon_9 | 14 | 14 | 0 |
0/0 |
| |\ +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Filter | anon_7:Location | 14 | 0 | 10 |
0/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (other)-[anon_6:WORKS_IN]->(anon_7) | 14 | 10 | 38 |
20/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | other | 14 | 10 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +LetAntiSemiApply | | 14 | 14 | 0 |
0/0 |
| |\ +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Filter | anon_5:Person | 12 | 0 | 10 |
0/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (other)-[anon_4:FRIENDS_WITH]->(anon_5) | 12 | 10 | 51 |
28/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | other | 14 | 14 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +NodeByLabelScan | p:Person | 14 | 14 | 35 |
1/0 |
+--------------------+-----------------------------------------+----------------+------+---------
+------------------------+

Total database accesses: 142, total allocated memory: 64

Select Or Semi Apply
The SelectOrSemiApply operator tests for the presence of a pattern predicate and evaluates a predicate,
and is a variation of the Apply operator. It allows normal predicates to be mixed with pattern predicates
that check for the presence of a pattern. The normal expression predicate is evaluated first, and only if it
returns false is the costlier pattern predicate evaluated.

Example 420. SelectOrSemiApply

Query

PROFILE
MATCH (other:Person)
WHERE other.age > 25 OR (other)-[:FRIENDS_WITH]->(:Person)
RETURN other.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------+-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------+-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | `other.name` | 11 | 10 | 0 |
| | | |
| | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| +Projection | other.name AS `other.name` | 11 | 10 | 20 |
| | | |
| | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| +SelectOrSemiApply | other.age > $autoint_0 | 14 | 10 | 0 |
392 | 0/0 | 0.190 | Fused in Pipeline 2 |
| |\ +-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| | +Limit | 1 | 14 | 10 | 0 |
752 | | | |
| | | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Filter | anon_3:Person | 12 | 10 | 20 |
| | | |
| | | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Expand(All) | (other)-[anon_2:FRIENDS_WITH]->(anon_3) | 12 | 10 | 37 |
| | | |
| | | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Argument | other | 14 | 14 | 0 |
2168 | 2/0 | 0.435 | Fused in Pipeline 1 |
| | +-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +NodeByLabelScan | other:Person | 14 | 14 | 35 |
| | | Fused in Pipeline 0 |
+--------------------+-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 148, total allocated memory: 2952

Select Or Anti Semi Apply
The SelectOrAntiSemiApply operator is used to evaluate OR between a predicate and a negative pattern
predicate (i.e. a pattern predicate preceded with NOT), and is a variation of the Apply operator. If the
predicate returns true, the pattern predicate is not tested. If the predicate returns false or null,
SelectOrAntiSemiApply will instead test the pattern predicate.

Example 421. SelectOrAntiSemiApply

Query

PROFILE
MATCH (other:Person)
WHERE other.age > 25 OR NOT (other)-[:FRIENDS_WITH]->(:Person)
RETURN other.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------+-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits
| Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------+-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | `other.name` | 4 | 4 | 0
| | | | |
| | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| +Projection | other.name AS `other.name` | 4 | 4 | 8
| | | | |
| | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| +SelectOrAntiSemiApply | other.age > $autoint_0 | 14 | 4 | 0
| 200 | 0/0 | 0.155 | Fused in Pipeline 3 |
| |\ +-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| | +Anti | | 14 | 4 | 0
| 1256 | 0/0 | 0.170 | In Pipeline 2 |
| | | +-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| | +Limit | 1 | 0 | 10 | 0
| 752 | | | |
| | | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Filter | anon_3:Person | 12 | 10 | 20
| | | | |
| | | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Expand(All) | (other)-[anon_2:FRIENDS_WITH]->(anon_3) | 12 | 10 | 37
| | | | |
| | | +-----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Argument | other | 14 | 14 | 0
| 2168 | 2/0 | 0.449 | Fused in Pipeline 1 |
| | +-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +NodeByLabelScan | other:Person | 14 | 14 | 35
| | | | Fused in Pipeline 0 |
+------------------------+-----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 136, total allocated memory: 4208

Let Select Or Semi Apply


The LetSelectOrSemiApply operator is planned for pattern predicates that are combined with other
predicates using OR. This is a variation of the Apply operator.

Example 422. LetSelectOrSemiApply

Query

PROFILE
CYPHER runtime=slotted
MATCH (other:Person)
WHERE (other)-[:FRIENDS_WITH]->(:Person) OR (other)-[:WORKS_IN]->(:Location) OR other.age = 5
RETURN other.name

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------------+-----------------------------------------+----------------+------+---------
+------------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Page Cache Hits/Misses |
+-----------------------+-----------------------------------------+----------------+------+---------
+------------------------+
| +ProduceResults | `other.name` | 13 | 14 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +Projection | other.name AS `other.name` | 13 | 14 | 14 |
1/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +SelectOrSemiApply | anon_9 | 14 | 14 | 0 |
0/0 |
| |\ +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Filter | anon_7:Location | 14 | 0 | 4 |
0/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (other)-[anon_6:WORKS_IN]->(anon_7) | 14 | 4 | 15 |
8/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | other | 14 | 4 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +LetSelectOrSemiApply | other.age = $autoint_0 | 14 | 14 | 14 |
0/0 |
| |\ +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Filter | anon_5:Person | 12 | 0 | 10 |
0/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (other)-[anon_4:FRIENDS_WITH]->(anon_5) | 12 | 10 | 51 |
28/0 |
| | | +-----------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | other | 14 | 14 | 0 |
0/0 |
| | +-----------------------------------------+----------------+------+---------
+------------------------+
| +NodeByLabelScan | other:Person | 14 | 14 | 35 |
1/0 |
+-----------------------+-----------------------------------------+----------------+------+---------
+------------------------+

Total database accesses: 179, total allocated memory: 64

Let Select Or Anti Semi Apply
The LetSelectOrAntiSemiApply operator is planned for negated pattern predicates — i.e. pattern
predicates preceded with NOT — that are combined with other predicates using OR. This operator is a
variation of the Apply operator.

Example 423. LetSelectOrAntiSemiApply

Query

PROFILE
CYPHER runtime=slotted
MATCH (other:Person)
WHERE NOT (other)-[:FRIENDS_WITH]->(:Person) OR (other)-[:WORKS_IN]->(:Location) OR other.age = 5
RETURN other.name

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+---------------------------+-----------------------------------------+----------------+------
+---------+------------------------+
| Operator | Details | Estimated Rows | Rows | DB
Hits | Page Cache Hits/Misses |
+---------------------------+-----------------------------------------+----------------+------
+---------+------------------------+
| +ProduceResults | `other.name` | 12 | 14 |
0 | 0/0 |
| | +-----------------------------------------+----------------+------
+---------+------------------------+
| +Projection | other.name AS `other.name` | 12 | 14 |
14 | 1/0 |
| | +-----------------------------------------+----------------+------
+---------+------------------------+
| +SelectOrSemiApply | anon_9 | 14 | 14 |
0 | 0/0 |
| |\ +-----------------------------------------+----------------+------
+---------+------------------------+
| | +Filter | anon_7:Location | 14 | 0 |
10 | 0/0 |
| | | +-----------------------------------------+----------------+------
+---------+------------------------+
| | +Expand(All) | (other)-[anon_6:WORKS_IN]->(anon_7) | 14 | 10 |
38 | 20/0 |
| | | +-----------------------------------------+----------------+------
+---------+------------------------+
| | +Argument | other | 14 | 10 |
0 | 0/0 |
| | +-----------------------------------------+----------------+------
+---------+------------------------+
| +LetSelectOrAntiSemiApply | other.age = $autoint_0 | 14 | 14 |
14 | 0/0 |
| |\ +-----------------------------------------+----------------+------
+---------+------------------------+
| | +Filter | anon_5:Person | 12 | 0 |
10 | 0/0 |
| | | +-----------------------------------------+----------------+------
+---------+------------------------+
| | +Expand(All) | (other)-[anon_4:FRIENDS_WITH]->(anon_5) | 12 | 10 |
51 | 28/0 |
| | | +-----------------------------------------+----------------+------
+---------+------------------------+
| | +Argument | other | 14 | 14 |
0 | 0/0 |
| | +-----------------------------------------+----------------+------
+---------+------------------------+
| +NodeByLabelScan | other:Person | 14 | 14 |
35 | 1/0 |
+---------------------------+-----------------------------------------+----------------+------
+---------+------------------------+

Total database accesses: 208, total allocated memory: 64

Merge
The Merge operator will either read or create nodes and/or relationships.

If matches are found, it will execute the provided ON MATCH operations for each incoming row. If no matches
are found, the nodes and relationships are created instead, and all ON CREATE operations are run.

Example 424. Merge

Query

PROFILE
MERGE (p:Person {name: 'Andy'})
ON MATCH SET p.existed = true
ON CREATE SET p.existed = false

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details |
Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+-----------------+-------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | |
1 | 0 | 0 | | | | |
| | +-------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +EmptyResult | |
1 | 0 | 0 | | | | |
| | +-------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Merge | CREATE (p:Person {name: $autostring_0}), ON MATCH SET p.existed = true, |
1 | 1 | 2 | | | | |
| | | ON CREATE SET p.existed = false |
| | | | | | |
| | +-------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexSeek | RANGE INDEX p:Person(name) WHERE name = $autostring_0 |
1 | 1 | 2 | 120 | 2/1 | 0.749 | Fused in Pipeline 0 |
+-----------------+-------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 4, total allocated memory: 184

Locking Merge
The LockingMerge operator is similar to the Merge operator, but it will lock the start and end nodes when
creating a relationship, if necessary.

Example 425. LockingMerge

Query

PROFILE
MATCH (s:Person {name: 'me'})
MERGE (s)-[:FRIENDS_WITH]->(s)

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | | 1 |
0 | 0 | | | | |
| | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +EmptyResult | 1 | | 1 |
0 | 0 | | | | |
| | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +Apply | 2 | | 1 |
1 | 0 | | | | |
| |\ +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +LockingMerge | 3 | CREATE (s)-[anon_0:FRIENDS_WITH]->(s), LOCK(s) | 1 |
1 | 1 | | | | |
| | | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Expand(Into) | 4 | (s)-[anon_0:FRIENDS_WITH]->(s) | 0 |
0 | 10 | 904 | | | |
| | | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Argument | 5 | s | 1 |
3 | 0 | 2280 | 2/0 | 0.460 | Fused in Pipeline 1 |
| | +----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +NodeIndexSeek | 6 | RANGE INDEX s:Person(name) WHERE name = $autostring_0 | 1 |
1 | 2 | 376 | 1/0 | 0.211 | In Pipeline 0 |
+-----------------+----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 15, total allocated memory: 2232

Roll Up Apply
The RollUpApply operator is used to execute an expression that takes a pattern as input and returns a list
built from the matched pattern; for example, when a pattern expression or pattern comprehension is used
in a query. This operator is a variation of the Apply operator.

Example 426. RollUpApply

Query

PROFILE
CYPHER runtime=slotted
MATCH (p:Person)
RETURN p.name, [(p)-[:WORKS_IN]->(location) | location.name] AS cities

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------+-----------------------------------+----------------+------+---------
+------------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Page Cache
Hits/Misses |
+-----------------+-----------------------------------+----------------+------+---------
+------------------------+
| +ProduceResults | `p.name`, cities | 14 | 14 | 0 |
0/0 |
| | +-----------------------------------+----------------+------+---------
+------------------------+
| +Projection | p.name AS `p.name` | 14 | 14 | 14 |
0/0 |
| | +-----------------------------------+----------------+------+---------
+------------------------+
| +RollUpApply | cities, anon_0 | 14 | 14 | 0 |
0/0 |
| |\ +-----------------------------------+----------------+------+---------
+------------------------+
| | +Projection | location.name AS anon_0 | 15 | 15 | 15 |
1/0 |
| | | +-----------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (p)-[anon_2:WORKS_IN]->(location) | 15 | 15 | 53 |
28/0 |
| | | +-----------------------------------+----------------+------+---------
+------------------------+
| | +Argument | p | 14 | 14 | 0 |
0/0 |
| | +-----------------------------------+----------------+------+---------
+------------------------+
| +NodeByLabelScan| p:Person | 14 | 14 | 35 |
1/0 |
+-----------------+-----------------------------------+----------------+------+---------
+------------------------+

Total database accesses: 153, total allocated memory: 64

Argument
The Argument operator indicates the variable to be used as an argument to the right-hand side of an Apply
operator.

Example 427. Argument

Query

PROFILE
MATCH (s:Person {name: 'me'})
MERGE (s)-[:FRIENDS_WITH]->(s)

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | | 1 |
0 | 0 | | | | |
| | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +EmptyResult | 1 | | 1 |
0 | 0 | | | | |
| | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +Apply | 2 | | 1 |
1 | 0 | | | | |
| |\ +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +LockingMerge | 3 | CREATE (s)-[anon_0:FRIENDS_WITH]->(s), LOCK(s) | 1 |
1 | 1 | | | | |
| | | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Expand(Into) | 4 | (s)-[anon_0:FRIENDS_WITH]->(s) | 0 |
0 | 10 | 904 | | | |
| | | +----+-------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Argument | 5 | s | 1 |
3 | 0 | 2280 | 2/0 | 0.460 | Fused in Pipeline 1 |
| | +----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +NodeIndexSeek | 6 | RANGE INDEX s:Person(name) WHERE name = $autostring_0 | 1 |
1 | 2 | 376 | 1/0 | 0.211 | In Pipeline 0 |
+-----------------+----+-------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 15, total allocated memory: 2232

Argument Tracker
The ArgumentTracker operator is used to ensure row-by-row semantics. It prevents the Cypher runtime from
batching operations into larger chunks.

Example 428. ArgumentTracker

The below query uses a variable scope clause (introduced in Neo4j 5.23) to import variables into the CALL
subquery. If you are using an older version of Neo4j, use an importing WITH clause instead.

Query

PROFILE
MATCH (s:Person {name: 'me'})
CALL (s) {
  SET s.seen = coalesce(s.seen + 1, 1)
  RETURN s.seen AS result
}
RETURN result;

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

+--------------------+----+---------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows
| Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------+----+---------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | result | 1
| 1 | 0 | 0 | | | |
| | +----+---------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +Apply | 1 | | 1
| 0 | 0 | | | | |
| |\ +----+---------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +ArgumentTracker | 7 | | 1
| 0 | 0 | 736 | | | |
| | | +----+---------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Projection | 2 | s.seen AS result | 1
| 1 | 2 | | | | |
| | | +----+---------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Eager | 3 | read/set conflict for property: seen (Operator: 4 vs 2) | 1
| 1 | 0 | 976 | 0/0 | 0.298 | Fused in Pipeline 2 |
| | | +----+---------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| | +SetProperty | 4 | s.seen = coalesce(s.seen + $autoint_1, $autoint_2) | 1
| 1 | 2 | | | | |
| | | +----+---------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Argument | 5 | s | 1
| 1 | 0 | 2408 | 2/0 | 1.734 | Fused in Pipeline 1 |
| | +----+---------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +NodeIndexSeek | 6 | RANGE INDEX s:Person(name) WHERE name = $autostring_0 | 1
| 1 | 2 | 368 | 1/0 | 0.183 | In Pipeline 0 |
+--------------------+----+---------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 6, total allocated memory: 4136

Expand All
Given a start node, and depending on the pattern relationship, the Expand(All) operator will traverse
incoming or outgoing relationships.

Example 429. Expand(All)

Query

PROFILE
MATCH (p:Person {name: 'me'})-[:FRIENDS_WITH]->(fof)
RETURN fof

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | fof | 1 | 2 |
0 | | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (p)-[anon_0:FRIENDS_WITH]->(fof) | 1 | 2 |
5 | | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeIndexSeek | RANGE INDEX p:Person(name) WHERE name = $autostring_0 | 1 | 1 |
2 | 120 | 4/1 | 1.137 | Fused in Pipeline 0 |
+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 7, total allocated memory: 184
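
The direction of the relationship in the pattern determines whether Expand(All) traverses outgoing or
incoming relationships. A minimal sketch against the same example data set (the reversed arrow is an
illustrative variation, not taken from the manual; the resulting plan would typically still contain an
Expand(All) operator anchored on the index seek):

PROFILE
// Reversed arrow: expand over incoming FRIENDS_WITH relationships instead of outgoing ones.
MATCH (p:Person {name: 'me'})<-[:FRIENDS_WITH]-(fan)
RETURN fan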

Expand Into
When both the start and end node have already been found, the Expand(Into) operator is used to find all
relationships connecting the two nodes. As both the start and end node of the relationship are already in
scope, the node with the smallest degree will be used. This can make a noticeable difference when dense
nodes appear as end points.

Example 430. Expand(Into)

Query

PROFILE
MATCH (p:Person {name: 'me'})-[:FRIENDS_WITH]->(fof)-->(p)
RETURN fof

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | fof | 0 | 0 |
0 | | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | not anon_1 = anon_0 | 0 | 0 |
0 | | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(Into) | (p)-[anon_0:FRIENDS_WITH]->(fof) | 0 | 0 |
6 | 896 | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | (p)<-[anon_1]-(fof) | 1 | 1 |
5 | | | | |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeIndexSeek | RANGE INDEX p:Person(name) WHERE name = $autostring_0 | 1 | 1 |
2 | 120 | 4/1 | 0.546 | Fused in Pipeline 0 |
+-----------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 13, total allocated memory: 976

Optional Expand All


The OptionalExpand(All) operator is analogous to Expand(All), apart from when no relationships match
the direction, type and property predicates. In this situation, OptionalExpand(All) will return a single row
with the relationship and end node set to null.

Example 431. OptionalExpand(All)

Query

PROFILE
MATCH (p:Person)
OPTIONAL MATCH (p)-[works_in:WORKS_IN]->(l)
WHERE works_in.duration > 180
RETURN p, l

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------+-------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details |
Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+----------------------+-------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | p, l |
14 | 15 | 1 | | | | |
| | +-------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +OptionalExpand(All) | (p)-[works_in:WORKS_IN]->(l) WHERE works_in.duration > $autoint_0 |
14 | 15 | 53 | | | | |
| | +-------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeByLabelScan | p:Person |
14 | 14 | 15 | 120 | 5/0 | 1,233 | Fused in Pipeline 0 |
+----------------------+-------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 125, total allocated memory: 184

Optional Expand Into


The OptionalExpand(Into) operator is analogous to Expand(Into), apart from when no matching
relationships are found. In this situation, OptionalExpand(Into) will return a single row with the
relationship and end node set to null. As both the start and end node of the relationship are already in
scope, the node with the smallest degree will be used. This can make a noticeable difference when dense
nodes appear as end points.

Example 432. OptionalExpand(Into)

Query

PROFILE
MATCH (p:Person)-[works_in:WORKS_IN]->(l)
OPTIONAL MATCH (l)-->(p)
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------------+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | p | 15 | 15 | 0 |
| | | |
| | +------------------------------+----------------+------+---------
+----------------+ | | |
| +OptionalExpand(Into) | (l)-[anon_0]->(p) | 15 | 15 | 105 |
3360 | | | |
| | +------------------------------+----------------+------+---------
+----------------+ | | |
| +Expand(All) | (p)-[works_in:WORKS_IN]->(l) | 15 | 15 | 39 |
| | | |
| | +------------------------------+----------------+------+---------
+----------------+ | | |
| +NodeByLabelScan | p:Person | 14 | 14 | 15 |
120 | 7/0 | 3,925 | Fused in Pipeline 0 |
+-----------------------+--- --------------------------+----------------+------+---------+
----------------+------------------------+-----------+---------------------+

Total database accesses: 215, total allocated memory: 3440

VarLength Expand All


Given a start node, the VarLengthExpand(All) operator will traverse variable-length and quantified
relationships.

Example 433. VarLengthExpand(All)

Query

PROFILE
MATCH (p:Person)-[:FRIENDS_WITH *1..2]-(q:Person)
RETURN p, q

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | p, q | 40 | 48 | 0 |
| | | |
| | +-----------------------------------+----------------+------+---------
+----------------+ | | |
| +Filter | q:Person | 40 | 48 | 96 |
| | | |
| | +-----------------------------------+----------------+------+---------
+----------------+ | | |
| +VarLengthExpand(All) | (p)-[anon_0:FRIENDS_WITH*..2]-(q) | 40 | 48 | 151 |
128 | | | |
| | +-----------------------------------+----------------+------+---------
+----------------+ | | |
| +NodeByLabelScan | p:Person | 14 | 14 | 15 |
120 | 6/0 | 10,457 | Fused in Pipeline 0 |
+-----------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 318, total allocated memory: 208

VarLength Expand Into


When both the start and end node have already been found, the VarLengthExpand(Into) operator is used
to find all variable-length and quantified relationships connecting the two nodes.

Example 434. VarLengthExpand(Into)

Query

PROFILE
MATCH (p:Person)-[:FRIENDS_WITH *1..2]-(p:Person)
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits |
Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | p | 3 | 4 | 0 |
| | | |
| | +-----------------------------------+----------------+------+---------
+----------------+ | | |
| +VarLengthExpand(Into) | (p)-[anon_0:FRIENDS_WITH*..2]-(p) | 3 | 4 | 151 |
128 | | | |
| | +-----------------------------------+----------------+------+---------
+----------------+ | | |
| +NodeByLabelScan | p:Person | 14 | 14 | 15 |
120 | 6/0 | 0,797 | Fused in Pipeline 0 |
+------------------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 222, total allocated memory: 192

VarLength Expand Pruning


Given a start node, the VarLengthExpand(Pruning) operator will traverse variable-length and quantified
relationships much like the VarLengthExpand(All) operator. However, as an optimization, some paths will
not be explored if they are guaranteed to produce an end node that has already been found (by means of a
previous path traversal).

This kind of expand is only planned when:

• The individual paths are not of interest.

• The relationships have an upper bound.

The VarLengthExpand(Pruning) operator guarantees that all the end nodes produced will be unique.

Example 435. VarLengthExpand(Pruning)

Query

PROFILE
MATCH (p:Person)-[:FRIENDS_WITH *3..4]-(q:Person)
RETURN DISTINCT p, q

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------+----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits |
Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
+---------------------------+----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------+
| +ProduceResults | 0 | p, q | 0 | 0 | 0 |
| 0/0 | 0.005 | | |
| | +----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+ | |
| +OrderedDistinct | 1 | p, q | 0 | 0 | 0 |
40 | 0/0 | 0.014 | | |
| | +----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+ | |
| +Filter | 2 | q:Person | 0 | 0 | 0 |
| 0/0 | 0.014 | | |
| | +----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+ | |
| +VarLengthExpand(Pruning) | 3 | (p)-[:FRIENDS_WITH*3..4]-(q) | 1 | 0 | 15 |
400 | | | | In Pipeline 1 |
| | +----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+ +---------------+
| +NodeByLabelScan | 4 | p:Person | 14 | 14 | 15 |
120 | 1/0 | 0.020 | p ASC | In Pipeline 0 |
+---------------------------+----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------+

Total database accesses: 30, total allocated memory: 480
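
Conversely, when the individual paths are of interest, this optimization cannot be applied. A minimal
sketch (an illustrative query, not from the manual; because the path variable makes every distinct path
part of the result, the planner is expected to fall back to VarLengthExpand(All) here):

PROFILE
// Returning the whole path means each path matters, so pruning is not applicable.
MATCH path = (p:Person)-[:FRIENDS_WITH*3..4]-(q:Person)
RETURN path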

Breadth First VarLength Expand Pruning


Given a start node, the VarLengthExpand(Pruning,BFS,All) operator traverses variable-length and
quantified relationships much like the VarLengthExpand(All) operator. However, as an optimization, it
instead performs a breadth-first search (BFS) and while expanding, some paths are not explored if they
are guaranteed to produce an end node that has already been found (by means of a previous path
traversal). This is only used in cases where the individual paths are not of interest.

This kind of expand is only planned when:

• The individual paths are not of interest.

• The lower bound is either 0 or 1 (default).

This operator guarantees that all the end nodes produced are unique.

Query

PROFILE
MATCH (p:Person)-[:FRIENDS_WITH *..4]-(q:Person)
RETURN DISTINCT p, q

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------------------+----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits
| Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
+-----------------------------------+----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------------+
| +ProduceResults | 0 | p, q | 12 | 0 | 0
| 0 | | | | |
| | +----+------------------------------+----------------+------+---------
+----------------+ | | | |
| +OrderedDistinct | 1 | p, q | 12 | 0 | 0
| 40 | | | | |
| | +----+------------------------------+----------------+------+---------
+----------------+ | | | |
| +Filter | 2 | q:Person | 13 | 0 | 0
| | | | | |
| | +----+------------------------------+----------------+------+---------
+----------------+ | | | |
| +VarLengthExpand(Pruning,BFS,All) | 3 | (p)-[:FRIENDS_WITH*..4]-(q) | 13 | 0 | 38
| 952 | | | | |
| | +----+------------------------------+----------------+------+---------
+----------------+ | | | |
| +NodeByLabelScan | 4 | p:Person | 10 | 10 | 11
| 248 | 3/0 | 4.662 | p ASC | Fused in Pipeline 0 |
+-----------------------------------+----+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------------+

Total database accesses: 49, total allocated memory: 1200

Repeat (Trail) (introduced in Neo4j 5.9)


Given a start node, the Repeat(Trail) operator will traverse quantified path patterns that cannot be solved
(or solved efficiently) with the VarLengthExpand(All) operator. Similar to an Apply operator, it takes a
single row from the left-hand side and applies the operators on the right-hand side. In contrast to Apply,
however, it repeatedly applies these operators in accordance with the quantifiers on the quantified path
pattern. In the following example, the operator will repeat twice and produce rows for both repetitions.

Example 436. Repeat(Trail)

Query

PROFILE
MATCH (me:Person) ((a)-[:FRIENDS_WITH]-(b)-[:FRIENDS_WITH]-(c) WHERE a.name <> b.name AND a.name <>
c.name AND b.name <> c.name){1,2} (friend:Person)
RETURN me, friend

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| +ProduceResults | 0 | me, friend
| 2 | 34 | 136 | 0 | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| +Filter | 1 | friend:Person
| 2 | 34 | 68 | | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| +NullifyMetadata | 9 |
| 2 | 34 | 0 | | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| +Repeat(Trail) | 2 | (me) (...){1, 2} (friend)
| 2 | 34 | 0 | 29792 | 0/0 | 1.696 | Fused in
Pipeline 2 |
| |\ +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| | +Filter | 3 | NOT anon_5 = anon_3 AND (NOT cache[a.name] = cache[c.name] AND NOT
cache[b.name] = cache[c.name]) AN | 1 | 34 | 92 | |
| | |
| | | | | D isRepeatTrailUnique(anon_5)
| | | | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| | +Expand(All) | 4 | (b)-[anon_5:FRIENDS_WITH]-(c)
| 3 | 92 | 138 | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |

|
| | +Filter | 5 | NOT cache[a.name] = cache[b.name] AND isRepeatTrailUnique(anon_3)
| 5 | 46 | 198 | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| | +Expand(All) | 6 | (a)-[anon_3:FRIENDS_WITH]-(b)
| 10 | 66 | 100 | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| | +Argument | 7 | a
| 15 | 34 | 0 | 15672 | 2/0 | 3.245 | Fused in
Pipeline 1 |
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| +NodeByLabelScan | 8 | me:Person
| 14 | 14 | 15 | 376 | 1/0 | 0.107 | In Pipeline
0 |
+------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+

Total database accesses: 747, total allocated memory: 45832

Nullify Metadata (introduced in Neo4j 5.9)


NullifyMetadata is responsible for cleaning up the state produced by Repeat(Trail). It is only planned
directly after Repeat(Trail).

Example 437. NullifyMetadata

Query

PROFILE
MATCH (me:Person) ((a)-[:FRIENDS_WITH]-(b)-[:FRIENDS_WITH]-(c) WHERE a.name <> b.name AND a.name <>
c.name AND b.name <> c.name){1,2} (friend:Person)
RETURN me, friend

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| +ProduceResults | 0 | me, friend
| 2 | 34 | 136 | 0 | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| +Filter | 1 | friend:Person
| 2 | 34 | 68 | | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| +NullifyMetadata | 9 |
| 2 | 34 | 0 | | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| +Repeat(Trail) | 2 | (me) (...){1, 2} (friend)
| 2 | 34 | 0 | 29792 | 0/0 | 1.696 | Fused in
Pipeline 2 |
| |\ +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| | +Filter | 3 | NOT anon_5 = anon_3 AND (NOT cache[a.name] = cache[c.name] AND NOT
cache[b.name] = cache[c.name]) AN | 1 | 34 | 92 | |
| | |
| | | | | D isRepeatTrailUnique(anon_5)
| | | | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| | +Expand(All) | 4 | (b)-[anon_5:FRIENDS_WITH]-(c)
| 3 | 92 | 138 | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |

|
| | +Filter | 5 | NOT cache[a.name] = cache[b.name] AND isRepeatTrailUnique(anon_3)
| 5 | 46 | 198 | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| | +Expand(All) | 6 | (a)-[anon_3:FRIENDS_WITH]-(b)
| 10 | 66 | 100 | | | |
|
| | | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+ | |
|
| | +Argument | 7 | a
| 15 | 34 | 0 | 15672 | 2/0 | 3.245 | Fused in
Pipeline 1 |
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+
| +NodeByLabelScan | 8 | me:Person
| 14 | 14 | 15 | 376 | 1/0 | 0.107 | In Pipeline
0 |
+------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+----------------+------------------------+-----------+
---------------------+

Total database accesses: 747, total allocated memory: 45832

Assert Same Node


The AssertSameNode operator is used to ensure that no node property uniqueness constraints are violated
in the slotted and interpreted runtime. The example looks for the presence of a team node with the
supplied name and id, and if one does not exist, it will be created. Owing to the existence of two node
property uniqueness constraints on :Team(name) and :Team(id), any node that would be found by the
UniqueIndexSeek operator must be the very same node or the constraints would be violated.

Example 438. AssertSameNode

Query

PROFILE
CYPHER runtime=slotted
MERGE (t:Team {name: 'Engineering', id: 42})

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+---------------------------------+-------------------------------------------------------
+----------------+------+---------+------------------------+
| Operator | Details | Estimated
Rows | Rows | DB Hits | Page Cache Hits/Misses |
+---------------------------------+-------------------------------------------------------
+----------------+------+---------+------------------------+
| +ProduceResults | |
1 | 0 | 0 | 0/0 |
| | +-------------------------------------------------------
+----------------+------+---------+------------------------+
| +EmptyResult | |
1 | 0 | 0 | 0/0 |
| | +-------------------------------------------------------
+----------------+------+---------+------------------------+
| +Merge | CREATE (t:Team {name: $autostring_0, id: $autoint_1}) |
1 | 1 | 0 | 0/0 |
| | +-------------------------------------------------------
+----------------+------+---------+------------------------+
| +AssertSameNode | t |
0 | 1 | 0 | 0/0 |
| |\ +-------------------------------------------------------
+----------------+------+---------+------------------------+
| | +NodeUniqueIndexSeek(Locking) | UNIQUE t:Team(id) WHERE id = $autoint_1 |
1 | 1 | 1 | 0/1 |
| | +-------------------------------------------------------
+----------------+------+---------+------------------------+
| +NodeUniqueIndexSeek(Locking) | UNIQUE t:Team(name) WHERE name = $autostring_0 |
1 | 1 | 1 | 0/1 |
+---------------------------------+-------------------------------------------------------
+----------------+------+---------+------------------------+

Total database accesses: 2, total allocated memory: 64
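
This plan presupposes that both node property uniqueness constraints already exist. A minimal sketch of
that setup (the constraint names are illustrative):

CREATE CONSTRAINT team_name_unique IF NOT EXISTS
FOR (t:Team) REQUIRE t.name IS UNIQUE;

CREATE CONSTRAINT team_id_unique IF NOT EXISTS
FOR (t:Team) REQUIRE t.id IS UNIQUE;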

Assert Same Relationship (introduced in Neo4j 5.8)


The AssertSameRelationship operator is used to ensure that no relationship property uniqueness
constraints are violated in the slotted and interpreted runtime. The example looks for the presence of a
WORKS_IN relationship with the supplied id and badgeNumber. If it can’t be found, then it will be created.
Owing to the existence of two property uniqueness constraints on :WORKS_IN(id) and
:WORKS_IN(badgeNumber), any relationship that would be found by the
DirectedRelationshipUniqueIndexSeek operator must be the very same relationship or the constraints
would be violated.

Example 439. AssertSameRelationship

Query

PROFILE
CYPHER runtime=slotted
MERGE (person)-[work:WORKS_IN {id: 0, badgeNumber: 4332}]->(location)

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-------------------------------------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+-------------------------------------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+
| +ProduceResults | 0 |
| 1 | 0 | 0 | 0/0 |
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+
| +EmptyResult | 1 |
| 1 | 0 | 0 | 0/0 |
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+
| +Merge | 2 | CREATE (person), (location), (person)-
[work:WORKS_IN {id: $autoint_0, badgeNumber: $autoint_1}]->(lo | 1 | 1 | 0 |
0/0 |
| | | | cation)
| | | | |
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+
| +AssertSameRelationship | 3 | work
| 0 | 1 | 0 | 0/0 |
| |\ +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+
| | +DirectedRelationshipUniqueIndexSeek(Locking) | 4 | RANGE INDEX (person)-
[work:WORKS_IN(badgeNumber)]->(location) WHERE badgeNumber = $autoint_1 | 1 |
1 | 1 | 0/1 |
| | +----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+
| +DirectedRelationshipUniqueIndexSeek(Locking) | 5 | RANGE INDEX (person)-[work:WORKS_IN(id)]-
>(location) WHERE id = $autoint_0 | 1 | 1 | 1 |
1/1 |
+-------------------------------------------------+----
+----------------------------------------------------------------------------------------------------
--+----------------+------+---------+------------------------+

Total database accesses: 2, total allocated memory: 64
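
As with AssertSameNode, this plan presupposes that both relationship property uniqueness constraints
already exist. A minimal sketch of that setup (the constraint names are illustrative):

CREATE CONSTRAINT works_in_id_unique IF NOT EXISTS
FOR ()-[w:WORKS_IN]-() REQUIRE w.id IS UNIQUE;

CREATE CONSTRAINT works_in_badge_unique IF NOT EXISTS
FOR ()-[w:WORKS_IN]-() REQUIRE w.badgeNumber IS UNIQUE;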

Empty Result
The EmptyResult operator eagerly loads all incoming data and discards it.

Example 440. EmptyResult

Query

PROFILE
CREATE (:Person)

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-----------------+----------------+------+---------+------------------------
+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time
(ms) | Pipeline |
+-----------------+-----------------+----------------+------+---------+------------------------
+-----------+---------------------+
| +ProduceResults | | 1 | 0 | 0 | |
| |
| | +-----------------+----------------+------+---------+ |
| |
| +EmptyResult | | 1 | 0 | 0 | |
| |
| | +-----------------+----------------+------+---------+ |
| |
| +Create | (anon_0:Person) | 1 | 1 | 1 | 0/0 |
0.000 | Fused in Pipeline 0 |
+-----------------+-----------------+----------------+------+---------+------------------------
+-----------+---------------------+

Total database accesses: 1, total allocated memory: 184

Produce Results
The ProduceResults operator prepares the result so that it is consumable by the user, such as transforming
internal values to user values. It is present in every single query that returns data to the user, and has little
bearing on performance optimisation.

Example 441. ProduceResults

Query

PROFILE
MATCH (n)
RETURN n

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | n | 35 | 35 | 0 | |
| | |
| | +---------+----------------+------+---------+----------------+
| | |
| +AllNodesScan | n | 35 | 35 | 36 | 120 |
3/0 | 0.508 | Fused in Pipeline 0 |
+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 36, total allocated memory: 184

Load CSV
The LoadCSV operator loads data from a CSV source into the query. It is used whenever the LOAD CSV
clause is used in a query.

Example 442. LoadCSV

Query

PROFILE
LOAD CSV FROM 'https://neo4j.com/docs/cypher-refcard/3.3/csv/artists.csv' AS line
RETURN line

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------+
| +ProduceResults | line | 10 | 4 | 0 | |
0/0 | 0.210 | |
| | +---------+----------------+------+---------+----------------
+------------------------+-----------+ |
| +LoadCSV | line | 10 | 4 | 0 | 72 |
| | In Pipeline 1 |
+-----------------+---------+----------------+------+---------+----------------
+------------------------+-----------+---------------+

Total database accesses: 0, total allocated memory: 184
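
Because the operator is planned for every use of the LOAD CSV clause, the WITH HEADERS variant is handled
the same way. A minimal sketch (the file name and its Name column are hypothetical, for illustration only):

PROFILE
// Each row becomes a map keyed by the header names.
LOAD CSV WITH HEADERS FROM 'file:///artists-with-headers.csv' AS row
RETURN row.Name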

Hash joins in general


Hash joins have two inputs: the build input and probe input. The query planner assigns these roles so that
the smaller of the two inputs is the build input. The build input is pulled in eagerly, and is used to build a
probe table. Once this is complete, the probe table is checked for each row coming from the probe input
side.

In query plans, the build input is always the left operator, and the probe input the right operator.

There are four hash join operators:

• NodeHashJoin

• ValueHashJoin

• NodeLeftOuterHashJoin

• NodeRightOuterHashJoin

Node Hash Join


The NodeHashJoin operator is a variation of the hash join. NodeHashJoin executes the hash join on node ids.
Because node ids can be represented as primitive values and stored in primitive arrays, this join can be
performed very efficiently.

Example 443. NodeHashJoin

Query

PROFILE
MATCH (bob:Person {name: 'Bob'})-[:WORKS_IN]->(loc)<-[:WORKS_IN]-(matt:Person {name: 'Mattias'})
USING JOIN ON loc
RETURN loc.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows
| DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | `loc.name` | 10 | 0
| 0 | | 0/0 | 0.000 | |
| | +----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
| +Projection | loc.name AS `loc.name` | 10 | 0
| 0 | | 0/0 | 0.000 | |
| | +----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
| +Filter | not anon_0 = anon_1 | 10 | 0
| 0 | | 0/0 | 0.000 | |
| | +----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
| +NodeHashJoin | loc | 10 | 0
| 0 | 3688 | | 0.053 | In Pipeline 2 |
| |\ +----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| | +Expand(All) | (matt)-[anon_1:WORKS_IN]->(loc) | 19 | 0
| 0 | | | | |
| | | +----------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +NodeIndexSeek | RANGE INDEX matt:Person(name) WHERE name = $autostring_1 | 1 | 0
| 1 | 120 | 1/0 | 0.288 | Fused in Pipeline 1 |
| | +----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +Expand(All) | (bob)-[anon_0:WORKS_IN]->(loc) | 19 | 1
| 4 | | | | |
| | +----------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeIndexSeek | RANGE INDEX bob:Person(name) WHERE name = $autostring_0 | 1 | 1
| 2 | 120 | 3/0 | 0.556 | Fused in Pipeline 0 |
+------------------+----------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 7, total allocated memory: 3888

Value Hash Join


The ValueHashJoin operator is a variation of the hash join. This operator allows for arbitrary values to be
used as the join key. It is most frequently used to solve predicates of the form: n.prop1 = m.prop2 (i.e.
equality predicates between two property columns).

Example 444. ValueHashJoin

Query

PROFILE
MATCH
(p:Person),
(q:Person)
WHERE p.age = q.age
RETURN p, q

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+-------------------+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | p, q | 10 | 0 | 0 | |
0/0 | 0.000 | |
| | +---------------+----------------+------+---------+----------------
+------------------------+-----------+ |
| +ValueHashJoin | p.age = q.age| 10 | 0 | 0 | 344 |
| | In Pipeline 2 |
| |\ +---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| | +NodeByLabelScan| q:Person | 15 | 0 | 0 | 120 |
0/0 | 0,000 | In Pipeline 1 |
| | +---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +NodeByLabelScan | p:Person | 15 | 15 | 16 | 120 |
1/0 | 0,211 | In Pipeline 0 |
+-------------------+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 71, total allocated memory: 664

Node Left/Right Outer Hash Join


The NodeLeftOuterHashJoin and NodeRightOuterHashJoin operators are variations of the hash join. The
query below can be planned with either a left or a right outer join. The decision depends on the
cardinalities of the left-hand and right-hand sides; i.e. how many rows would be returned, respectively, for
(a:Person) and (a)-->(b:Person). If (a:Person) returns fewer results than (a)-->(b:Person), a left outer
join — indicated by NodeLeftOuterHashJoin — is planned. On the other hand, if (a:Person) returns more
results than (a)-->(b:Person), a right outer join — indicated by NodeRightOuterHashJoin — is planned
instead.

Example 445. NodeRightOuterHashJoin

Query

PROFILE
MATCH (a:Person)
OPTIONAL MATCH (a)-->(b:Person)
USING JOIN ON a
RETURN a.name, b.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------------+------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------------+------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | `a.name`, `b.name` | 14 |
16 | 0 | | 0/0 | 0.102 | |
| | +------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
| +Projection | cache[a.name] AS `a.name`, cache[b.name] AS `b.name` | 14 |
16 | 8 | | 0/0 | 0.055 | |
| | +------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
| +NodeRightOuterHashJoin | a | 14 |
16 | 0 | 4232 | | 0.269 | In Pipeline 2 |
| |\ +------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| | +NodeByLabelScan | a:Person | 15 |
15 | 16 | 120 | 1/0 | 0,049 | In Pipeline 1 |
| | +------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +CacheProperties | cache[b.name], cache[a.name] | 13 |
13 | 39 | | | | |
| | +------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +Expand(All) | (b)<-[anon_0]-(a) | 13 |
13 | 55 | | | | |
| | +------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeByLabelScan | b:Person | 15 |
15 | 16 | 120 | 5/0 | 1,150 | Fused in Pipeline 0 |
+-------------------------+------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 211, total allocated memory: 4312

Triadic Selection
The TriadicSelection operator is used to solve triangular queries, such as the very common 'find my
friends-of-friends that are not already my friend'. It does so by putting all the friends into a set and using
the set to check whether the friends-of-friends are already connected to me. The example finds the names of all
friends of my friends that are not already my friends.

Example 446. TriadicSelection

Query

PROFILE
CYPHER runtime=slotted
MATCH (me:Person)-[:FRIENDS_WITH]-()-[:FRIENDS_WITH]-(other)
WHERE NOT (me)-[:FRIENDS_WITH]-(other)
RETURN other.name

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-------------------+----------------------------------------+----------------+------+---------
+------------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Page
Cache Hits/Misses |
+-------------------+----------------------------------------+----------------+------+---------
+------------------------+
| +ProduceResults | `other.name` | 4 | 24 | 0 |
0/0 |
| | +----------------------------------------+----------------+------+---------
+------------------------+
| +Projection | other.name AS `other.name` | 4 | 24 | 24 |
1/0 |
| | +----------------------------------------+----------------+------+---------
+------------------------+
| +Filter | not anon_2 = anon_4 | 16 | 24 | 0 |
0/0 |
| | +----------------------------------------+----------------+------+---------
+------------------------+
| +TriadicSelection | WHERE NOT (me)--(other) | 4 | 24 | 0 |
0/0 |
| |\ +----------------------------------------+----------------+------+---------
+------------------------+
| | | +----------------------------------------+----------------+------+---------
+------------------------+
| | +Expand(All) | (anon_3)-[anon_4:FRIENDS_WITH]-(other) | 16 | 48 | 98 |
48/0 |
| | | +----------------------------------------+----------------+------+---------
+------------------------+
| | +Argument | anon_3, anon_2 | 24 | 24 | 0 |
0/0 |
| | +----------------------------------------+----------------+------+---------
+------------------------+
| +Expand(All) | (me)-[anon_2:FRIENDS_WITH]-(anon_3) | 24 | 24 | 53 |
28/0 |
| | +----------------------------------------+----------------+------+---------
+------------------------+
| +NodeByLabelScan | me:Person | 15 | 15 | 16 |
1/0 |
+-------------------+----------------------------------------+----------------+------+---------
+------------------------+

Total database accesses: 246, total allocated memory: 64

Triadic Build
The TriadicBuild operator is used in conjunction with TriadicFilter to solve triangular queries, such as
the very common 'find my friend-of-friends that are not already my friend'. These two operators are
specific to Pipelined runtime and together perform the same logic as TriadicSelection does for other
runtimes. TriadicBuild builds a set of all friends, which is later used by TriadicFilter. The example finds
the names of all friends of my friends that are not already my friends.

Example 447. TriadicBuild

Query

PROFILE
CYPHER runtime=pipelined
MATCH (me:Person)-[:FRIENDS_WITH]-()-[:FRIENDS_WITH]-(other)
WHERE NOT (me)-[:FRIENDS_WITH]-(other)
RETURN other.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | `other.name` | 4 | 24 | 0 |
| 0/0 | 0.133 | |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+ |
| +Projection | other.name AS `other.name` | 4 | 24 | 48 |
| 2/0 | 0.056 | |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +Filter | not anon_2 = anon_4 | 16 | 24 | 0 |
| | | |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +TriadicFilter | WHERE NOT (me)--(other) | 4 | 24 | 0 |
4136 | 0/0 | 0.195 | In Pipeline 3 |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +Apply | | 16 | 24 | 0 |
| 0/0 | | |
| |\ +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| | | +----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Expand(All) | (anon_3)-[anon_4:FRIENDS_WITH]-(other) | 16 | 48 | 98 |
| | | |
| | | +----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Argument | anon_3, anon_2 | 24 | 24 | 0 |
4200 | 0/0 | 0.397 | Fused in Pipeline 2 |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +TriadicBuild | (me)--(anon_3) | 24 | 24 | 0 |
888 | 0/0 | 1.427 | In Pipeline 1 |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +Expand(All) | (me)-[anon_2:FRIENDS_WITH]-(anon_3) | 24 | 24 | 39 |
| | | |
| | +----------------------------------------+----------------+------+---------
+----------------+ | | |
| +NodeByLabelScan| me:Person | 15 | 15 | 16 |
120 | 3/0 | 0,200 | Fused in Pipeline 0 |
+-----------------+----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 256, total allocated memory: 7376

Triadic Filter
The TriadicFilter operator is used in conjunction with TriadicBuild to solve triangular queries, such as
the very common 'find my friend-of-friends that are not already my friend'. These two operators are
specific to Pipelined runtime and together perform the same logic as TriadicSelection does for other
runtimes. TriadicFilter uses a set of friends previously built by TriadicBuild to check if the friend-of-friends
are already connected to me. The example finds the names of all friends of my friends that are not
already my friends.

Example 448. TriadicFilter

Query

PROFILE
CYPHER runtime=pipelined
MATCH (me:Person)-[:FRIENDS_WITH]-()-[:FRIENDS_WITH]-(other)
WHERE NOT (me)-[:FRIENDS_WITH]-(other)
RETURN other.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | `other.name` | 4 | 24 | 0 |
| 0/0 | 0.189 | |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+ |
| +Projection | other.name AS `other.name` | 4 | 24 | 48 |
| 2/0 | 0.381 | |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +Filter | not anon_2 = anon_4 | 16 | 24 | 0 |
| | | |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +TriadicFilter | WHERE NOT (me)--(other) | 4 | 24 | 0 |
4136 | 0/0 | 0.685 | In Pipeline 3 |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +Apply | | 16 | 24 | 0 |
| 0/0 | | |
| |\ +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| | | +----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Expand(All) | (anon_3)-[anon_4:FRIENDS_WITH]-(other) | 16 | 48 | 98 |
| | | |
| | | +----------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Argument | anon_3, anon_2 | 24 | 24 | 0 |
4200 | 0/0 | 0.496 | Fused in Pipeline 2 |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +TriadicBuild | (me)--(anon_3) | 24 | 24 | 0 |
888 | 0/0 | 3.268 | In Pipeline 1 |
| | +----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +Expand(All) | (me)-[anon_2:FRIENDS_WITH]-(anon_3) | 24 | 24 | 39 |
| | | |
| | +----------------------------------------+----------------+------+---------
+----------------+ | | |
| +NodeByLabelScan| me:Person | 15 | 15 | 16 |
120 | 3/0 | 0,481 | Fused in Pipeline 0 |
+-----------------+----------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 256, total allocated memory: 7376

Cartesian Product
The CartesianProduct operator produces a cartesian product of the two inputs — each row coming from
the left child operator will be combined with all the rows from the right child operator. CartesianProduct
generally exhibits bad performance and ought to be avoided if possible.
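
As a rough illustration (not taken from the profiled example below, and assuming the WORKS_IN
relationship from this manual's example graph): relating the two node variables through a pattern
usually lets the planner expand from one side instead of combining every row with every other row.

// Sketch: the connected pattern typically avoids CartesianProduct in favour
// of a label scan plus Expand; verify the resulting plan with PROFILE.
PROFILE
MATCH (p:Person)-[:WORKS_IN]->(l:Location)
RETURN p, l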

Example 449. CartesianProduct

Query

PROFILE
MATCH
(p:Person),
(t:Team)
RETURN p, t

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------+----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+--------------------+----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | p, t | 140 | 140 | 0 | |
2/0 | 1.917 | |
| | +----------+----------------+------+---------+----------------
+------------------------+-----------+ |
| +CartesianProduct | | 140 | 140 | 0 | 1736 |
| 1.209 | In Pipeline 2 |
| |\ +----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| | +NodeByLabelScan | t:Team | 10 | 10 | 11 | 136 |
1/0 | 1.145 | In Pipeline 1 |
| | +----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +NodeByLabelScan | p:Person | 15 | 15 | 16 | 120 |
1/0 | 0.409 | In Pipeline 0 |
+--------------------+----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 142, total allocated memory: 1816

Foreach
The Foreach operator executes a nested loop between the left child operator and the right child operator.
In an analogous manner to the Apply operator, it takes a row from the left-hand side and, using the
Argument operator, provides it to the operator tree on the right-hand side. Foreach will yield all the rows
coming in from the left-hand side; all results from the right-hand side are pulled in and discarded.

Example 450. Foreach

Query

PROFILE
FOREACH (value IN [1,2,3] | CREATE (:Person {age: value}))

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------+---------------------------------------------------------+----------------+------
+---------+------------------------+
| Operator | Details | Estimated Rows | Rows |
DB Hits | Page Cache Hits/Misses |
+-----------------+---------------------------------------------------------+----------------+------
+---------+------------------------+
| +ProduceResults | | 1 | 0 |
0 | 0/0 |
| | +---------------------------------------------------------+----------------+------
+---------+------------------------+
| +EmptyResult | | 1 | 0 |
0 | 0/0 |
| | +---------------------------------------------------------+----------------+------
+---------+------------------------+
| +Foreach | value IN [1, 2, 3], CREATE (anon_0:Person {age: value}) | 1 | 1 |
9 | 0/0 |
+-----------------+---------------------------------------------------------+----------------+------
+---------+------------------------+

Total database accesses: 9, total allocated memory: 64

TransactionForeach
TransactionForeach works like the Foreach operator but will commit the current transaction after a
specified number of rows.
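
A minimal sketch of controlling the batch size and the error handling of the inner transactions.
The ON ERROR and REPORT STATUS options and the variable scope clause are assumed to be available
in your 5.x version; they are not part of the profiled example below.

// Sketch: commit every 500 rows, keep going past failing batches, and
// report the per-batch status (assumed syntax for newer 5.x versions).
LOAD CSV FROM 'https://neo4j.com/docs/cypher-refcard/3.3/csv/artists.csv' AS line
CALL (line) {
  CREATE (:Artist {name: line[0]})
} IN TRANSACTIONS OF 500 ROWS
  ON ERROR CONTINUE
  REPORT STATUS AS status
RETURN status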

Example 451. TransactionForeach

The query below uses a variable scope clause (introduced in Neo4j 5.23) to import variables into the
CALL subquery. If you are using an older version of Neo4j, use an importing WITH clause instead.

Query

PROFILE
LOAD CSV FROM 'https://neo4j.com/docs/cypher-refcard/3.3/csv/artists.csv' AS line
CALL (line) {
CREATE (a: Artist {name: line[0]})
} IN TRANSACTIONS OF 100 ROWS

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------+----+--------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------+----+--------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | | 10 |
0 | 0 | 0 | | | |
| | +----+--------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +EmptyResult | 1 | | 10 |
0 | 0 | | 0/0 | 0.000 | Fused in Pipeline 3 |
| | +----+--------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +TransactionForeach | 2 | IN TRANSACTIONS OF $autoint_1 ROWS ON ERROR FAIL | 10 |
4 | 0 | 4856 | | | |
| |\ +----+--------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Create | 3 | (a:Artist {name: line[$autoint_0]}) | 10 |
4 | 12 | | | | |
| | | +----+--------------------------------------------------+----------------
+------+---------+----------------+ | | |
| | +Argument | 4 | line | 10 |
4 | 0 | 3472 | 0/0 | 0.712 | Fused in Pipeline 2 |
| | +----+--------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +LoadCSV | 5 | line | 10 |
4 | 0 | 328 | | | In Pipeline 1 |
+---------------------+----+--------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 12, total allocated memory: 5704

SubqueryForeach
SubqueryForeach works like the Foreach operator, but it is only used for executing subqueries.

Example 452. SubqueryForeach

The query below uses a variable scope clause (introduced in Neo4j 5.23) to import variables into the
CALL subquery. If you are using an older version of Neo4j, use an importing WITH clause instead.

Query

PROFILE
LOAD CSV FROM 'https://neo4j.com/docs/cypher-refcard/3.3/csv/artists.csv' AS line
CALL (line) {
CREATE (a: Artist {name: line[0]})
}

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----+-------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits |
Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+----+-------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | | 10 | 0 | 0 |
0 | | | |
| | +----+-------------------------------------+----------------+------+---------
+----------------+ | | |
| +EmptyResult | 1 | | 10 | 0 | 0 |
| 0/0 | 0.000 | Fused in Pipeline 3 |
| | +----+-------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +SubqueryForeach | 2 | | 10 | 4 | 0 |
4080 | | | |
| |\ +----+-------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Create | 3 | (a:Artist {name: line[$autoint_0]}) | 10 | 4 | 12 |
| | | |
| | | +----+-------------------------------------+----------------+------+---------
+----------------+ | | |
| | +Argument | 4 | line | 10 | 4 | 0 |
3472 | 0/0 | 0.852 | Fused in Pipeline 2 |
| | +----+-------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +LoadCSV | 5 | line | 10 | 4 | 0 |
328 | | | In Pipeline 1 |
+------------------+----+-------------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 12, total allocated memory: 4928

Eager
The Eager operator causes all preceding operators to execute fully, for the whole dataset, before
continuing execution. This is done to ensure isolation between parts of the query plan that might
otherwise affect each other.

Values from the graph are fetched lazily; i.e. a pattern match might not be fully exhausted
before updates are applied. To maintain correct semantics, the query planner will insert Eager operators
into the query plan to prevent updates from influencing pattern matching or other read operations. This
scenario is exemplified by the query below, where the DELETE clause would otherwise influence both the
MATCH clause and the MERGE clause. For more information on how the Eager operator can ensure correct
semantics, see the section on Clause composition.

The Eager operator can cause high memory usage when importing data or migrating graph structures. In
such cases, the operations should be split into simpler steps, e.g. importing nodes and relationships
separately. Alternatively, the records to be updated can be returned first and then updated in a separate
statement, as in the sketch below.
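
A minimal sketch of splitting an import into simpler steps. The file names and column names
(people.csv, friendships.csv, name, from, to) are hypothetical; the point is that each statement then
touches only one kind of entity, which can reduce the memory cost introduced by Eager.

// Sketch: two separate statements instead of one combined import.
// Pass 1 - create the nodes (hypothetical file and columns).
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
MERGE (:Person {name: row.name});

// Pass 2 - create the relationships between existing nodes.
LOAD CSV WITH HEADERS FROM 'file:///friendships.csv' AS row
MATCH (a:Person {name: row.from}), (b:Person {name: row.to})
MERGE (a)-[:FRIENDS_WITH]-(b);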

Example 453. Eager

Query

PROFILE
MATCH (a:Person {name: 'me'}), (b:Person {name: 'Bob'})
DETACH DELETE a, b
MERGE (:Person {name: 'me'})

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------+----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+---------------------+----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | 0 |
| 0 | 0 | 0 | 0 | | |
|
| | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +EmptyResult | 1 |
| 0 | 0 | 0 | | | |
|
| | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Apply | 2 |
| 0 | 1 | 0 | | | |
|
| |\ +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| | +Merge | 3 | CREATE (anon_0:Person {name: $autostring_2})
| 0 | 1 | 3 | | | |
|
| | | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| | +NodeIndexSeek | 4 | RANGE INDEX anon_0:Person(name) WHERE name = $autostring_2
| 0 | 0 | 1 | 3304 | 1/0 | 0.663 | Fused in
Pipeline 3 |
| | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +Eager | 5 | read/delete conflict for variable: anon_0 (Operator: 6 vs 4, and 1 more
conflicting operators) | 0 | 1 | 0 | 360 | 0/0 |
0.008 | In Pipeline 2 |
| | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +DetachDelete | 6 | b
| 0 | 1 | 4 | | | |

|
| | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +DetachDelete | 7 | a
| 0 | 1 | 5 | | | |
|
| | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Eager | 8 | read/delete conflict for variable: b (Operator: 6 vs 10, and 1 more
conflicting operators), | 0 | 1 | 0 | 360 |
1/0 | 0.226 | Fused in Pipeline 1 |
| | | | read/set conflict for label: Person (Operator: 3 vs 10),
| | | | | | |
|
| | | | read/set conflict for property: name (Operator: 3 vs 10)
| | | | | | |
|
| | +----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +MultiNodeIndexSeek | 9 | RANGE INDEX a:Person(name) WHERE name = $autostring_0,
| 0 | 1 | 4 | 376 | 2/0 | 0.218 | In Pipeline
0 |
| | | RANGE INDEX b:Person(name) WHERE name = $autostring_1
| | | | | | |
|
+---------------------+----
+------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 17, total allocated memory: 4184

Eager Aggregation
The EagerAggregation operator evaluates a grouping expression and uses the result to group rows into
different groupings. For each of these groupings, EagerAggregation will then evaluate all aggregation
functions and return the result. To do this, EagerAggregation, as the name implies, needs to pull in all data
eagerly from its source and build up state, which leads to increased memory pressure in the system.

Example 454. EagerAggregation

Query

PROFILE
MATCH (l:Location)<-[:WORKS_IN]-(p:Person)
RETURN
l.name AS location,
collect(p.name) AS people

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------+----+------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-------------------+----+------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | location, people | 4 |
6 | 0 | | 0/0 | 0.022 | In Pipeline 1 |
| | +----+------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +EagerAggregation | 1 | cache[l.name] AS location, collect(p.name) AS people | 4 |
6 | 30 | 2584 | | | |
| | +----+------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +Filter | 2 | p:Person | 15 |
15 | 30 | | | | |
| | +----+------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +Expand(All) | 3 | (l)<-[anon_0:WORKS_IN]-(p) | 15 |
15 | 26 | | | | |
| | +----+------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +CacheProperties | 4 | cache[l.name] | 10 |
10 | 20 | | | | |
| | +----+------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeByLabelScan | 5 | l:Location | 10 |
10 | 11 | 120 | 4/0 | 0.813 | Fused in Pipeline 0 |
+-------------------+----+------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 117, total allocated memory: 2664

Ordered Aggregation
The OrderedAggregation operator is an optimization of the EagerAggregation operator that takes
advantage of the ordering of the incoming rows. This operator uses lazy evaluation and has a lower
memory pressure in the system than the EagerAggregation operator.

Example 455. OrderedAggregation

Query

PROFILE
MATCH (p:Person)
WHERE p.name STARTS WITH 'P'
RETURN p.name, count(*) AS count

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------
+---------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by
| Pipeline |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------
+---------------+
| +ProduceResults | `p.name`, count
| 0 | 2 | 0 | | 0/0 | 0.045 |
| |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
|
| +OrderedAggregation | cache[p.name] AS `p.name`, count(*) AS count
| 0 | 2 | 0 | 288 | 0/0 | 0.175 | `p.name`
ASC | In Pipeline 1 |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------
+---------------+
| +NodeIndexSeekByRange | RANGE INDEX p:Person(name) WHERE name STARTS WITH $autostring_0,
cache[p.name] | 0 | 2 | 3 | 120 | 0/1 | 0.529
| p.name ASC | In Pipeline 0 |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------
+---------------+

Total database accesses: 3, total allocated memory: 352

Node Count From Count Store


The NodeCountFromCountStore operator uses the count store to answer questions about node counts. This
is much faster than the EagerAggregation operator, which achieves the same result by actually counting.
However, as the count store only keeps a limited range of combinations, EagerAggregation will still be
used for more complex queries. For example, counts are available for all nodes and for nodes with a single
label, but not for nodes with more than one label.
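
A rough contrast of the two cases. The second label, :Actor, is a hypothetical example and is not part
of this manual's example data set.

// Single label: can typically be answered from the count store.
PROFILE
MATCH (n:Person)
RETURN count(n) AS people;

// Two labels (hypothetical :Person:Actor): not covered by the count store,
// so the planner typically falls back to a scan plus EagerAggregation.
PROFILE
MATCH (n:Person:Actor)
RETURN count(n) AS actors;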

Example 456. NodeCountFromCountStore

Query

PROFILE
MATCH (p:Person)
RETURN count(p) AS people

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------------+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------------+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+
| +ProduceResults | people | 1 | 1 | 0 |
| | | |
| | +------------------------------+----------------+------+---------
+----------------+ | | |
| +NodeCountFromCountStore | count( (:Person) ) AS people | 1 | 1 | 1 |
120 | 0/0 | 0.169 | Fused in Pipeline 0 |
+--------------------------+------------------------------+----------------+------+---------
+----------------+------------------------+-----------+---------------------+

Total database accesses: 1, total allocated memory: 184

Relationship Count From Count Store


The RelationshipCountFromCountStore operator uses the count store to answer questions about
relationship counts. This is much faster than the EagerAggregation operator, which achieves the same
result by actually counting. However, as the count store only keeps a limited range of combinations,
EagerAggregation will still be used for more complex queries. For example, counts are available for all
relationships, for relationships with a given type, and for relationships with a label on one end, but not
for relationships with labels on both end nodes.
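
A rough contrast of the two cases, reusing the WORKS_IN relationship from the example graph; verify
the resulting plans with PROFILE on your own data.

// Type plus a label on one end: typically answered from the count store.
PROFILE
MATCH (:Person)-[r:WORKS_IN]->()
RETURN count(r) AS jobs;

// Labels on both end nodes: not covered by the count store, so the planner
// typically falls back to Expand plus EagerAggregation.
PROFILE
MATCH (:Person)-[r:WORKS_IN]->(:Location)
RETURN count(r) AS jobs;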

Example 457. RelationshipCountFromCountStore

Query

PROFILE
MATCH (p:Person)-[r:WORKS_IN]->()
RETURN count(r) AS jobs

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------------------+--------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+----------------------------------+--------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | jobs | 1 |
1 | 0 | | | | |
| | +--------------------------------------------+----------------
+------+---------+----------------+ | | |
| +RelationshipCountFromCountStore | count( (:Person)-[:WORKS_IN]->() ) AS jobs | 1 |
1 | 1 | 120 | 0/0 | 0.625 | Fused in Pipeline 0 |
+----------------------------------+--------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 1, total allocated memory: 184

Distinct
The Distinct operator removes duplicate rows from the incoming stream of rows. To ensure only distinct
elements are returned, Distinct will pull in data lazily from its source and build up state. This may lead to
increased memory pressure in the system.

Example 458. Distinct

Query

PROFILE
MATCH (l:Location)<-[:WORKS_IN]-(p:Person)
RETURN DISTINCT p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----+-----------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) |
Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+----+-----------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | p | 14 | 14 | 28 | |
| | |
| | +----+-----------------------+----------------+------+---------+----------------+
| | |
| +Distinct | 1 | p | 14 | 14 | 0 | 352 |
| | |
| | +----+-----------------------+----------------+------+---------+----------------+
| | |
| +Filter | 2 | p:Person | 15 | 15 | 30 | |
| | |
| | +----+-----------------------+----------------+------+---------+----------------+
| | |
| +Expand(All) | 3 | (l)<-[r:WORKS_IN]-(p) | 15 | 15 | 26 | |
| | |
| | +----+-----------------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | 4 | l:Location | 10 | 10 | 11 | 120 |
4/0 | 0.287 | Fused in Pipeline 0 |
+------------------+----+-----------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 95, total allocated memory: 432

Ordered Distinct
The OrderedDistinct operator is an optimization of the Distinct operator that takes advantage of the
ordering of the incoming rows. This operator has a lower memory pressure in the system than the
Distinct operator.

Example 459. OrderedDistinct

Query

PROFILE
MATCH (p:Person)
WHERE p.name STARTS WITH 'P'
RETURN DISTINCT p.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------
+---------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by
| Pipeline |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------
+---------------+
| +ProduceResults | `p.name`
| 0 | 2 | 0 | | 0/0 | 0.046 |
| |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
|
| +OrderedDistinct | cache[p.name] AS `p.name`
| 0 | 2 | 0 | 32 | 0/0 | 0.090 | `p.name`
ASC | |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------+
|
| +NodeIndexSeekByRange | RANGE INDEX p:Person(name) WHERE name STARTS WITH $autostring_0,
cache[p.name] | 0 | 2 | 3 | 120 | 0/1 | 0.493
| p.name ASC | In Pipeline 0 |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+--------------
+---------------+

Total database accesses: 3, total allocated memory: 184

Filter
The Filter operator filters each row coming from the child operator, only passing through rows for which
the predicates evaluate to true.

Example 460. Filter

Query

PROFILE
MATCH (p:Person)
WHERE p.name =~ '^a.*'
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| Operator | Details | Estimated Rows
| Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+
| +ProduceResults | p | 14
| 0 | 0 | | | | |
| | +------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +Filter | cache[p.name] =~ $autostring_0 | 14
| 0 | 0 | | | | |
| | +------------------------------------------------------------------
+----------------+------+---------+----------------+ | |
|
| +NodeIndexScan | RANGE INDEX p:Person(name) WHERE name IS NOT NULL, cache[p.name] | 14
| 14 | 15 | 120 | 0/1 | 0.763 | Fused in Pipeline 0 |
+-----------------+------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------------+

Total database accesses: 15, total allocated memory: 184

Limit
The Limit operator returns the first n rows from the incoming input.

Example 461. Limit

Query

PROFILE
MATCH (p:Person)
RETURN p
LIMIT 3

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+-----------------+----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | p | 3 | 3 | 0 | |
| | |
| | +----------+----------------+------+---------+----------------+
| | |
| +Limit | 3 | 3 | 3 | 0 | 32 |
| | |
| | +----------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan| p:Person | 3 | 4 | 5 | 120 |
3/0 | 0.540 | Fused in Pipeline 0 |
+-----------------+----------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 8, total allocated memory: 184

Skip
The Skip operator skips n rows from the incoming rows.

Example 462. Skip

Query

PROFILE
MATCH (p:Person)
RETURN p
ORDER BY p.id
SKIP 1

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache
Hits/Misses | Time (ms) | Ordered by | Pipeline |
+------------------+----------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| +ProduceResults | p | 13 | 13 | 0 | |
2/0 | 0.165 | | |
| | +----------------+----------------+------+---------+----------------
+------------------------+-----------+ | |
| +Skip | $autoint_0 | 13 | 13 | 0 | 32 |
0/0 | 0.043 | | |
| | +----------------+----------------+------+---------+----------------
+------------------------+-----------+ | |
| +Sort | `p.id` ASC | 14 | 14 | 0 | 400 |
0/0 | 0.155 | p.id ASC | In Pipeline 1 |
| | +----------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| +Projection | p.id AS `p.id` | 14 | 14 | 0 | |
| | | |
| | +----------------+----------------+------+---------+----------------+
| +------------+ |
| +NodeByLabelScan | p:Person | 18 | 18 | 19 | 120 |
3/0 | 0.157 | | Fused in Pipeline 0 |
+------------------+----------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+

Total database accesses: 71, total allocated memory: 512

Sort
The Sort operator sorts rows by a provided key. In order to sort the data, all data from the source operator
needs to be pulled in eagerly and kept in the query state, which will lead to increased memory pressure in
the system.

Example 463. Sort

Query

PROFILE
MATCH (p:Person)
RETURN p
ORDER BY p.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+--------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page
Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
+------------------+--------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| +ProduceResults | p | 14 | 14 | 0 | |
2/0 | 0.178 | | |
| | +--------------------+----------------+------+---------+----------------
+------------------------+-----------+ | |
| +Sort | `p.name` ASC | 14 | 14 | 0 | 1192 |
0/0 | 0.107 | p.name ASC | In Pipeline 1 |
| | +--------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| +Projection | p.name AS `p.name` | 14 | 14 | 14 | |
| | | |
| | +--------------------+----------------+------+---------+----------------+
| +------------+ |
| +NodeByLabelScan |p:Person | 14 | 14 | 35 | 120 |
3/0 | 0.221 | | Fused in Pipeline 0 |
+------------------+--------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+

Total database accesses: 85, total allocated memory: 1272

Partial Sort
The PartialSort operator is an optimization of the Sort operator that takes advantage of the ordering of
the incoming rows. This operator uses lazy evaluation and has a lower memory pressure in the system
than the Sort operator. Partial sort is only applicable when sorting on multiple columns.

Example 464. PartialSort

Query

PROFILE
MATCH (p:Person)
WHERE p.name STARTS WITH 'P'
RETURN p
ORDER BY p.name, p.age

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by
| Pipeline |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+
| +ProduceResults | p
| 0 | 2 | 0 | | 2/0 | 0.087 |
| |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
|
| +PartialSort | `p.name` ASC, `p.age` ASC
| 0 | 2 | 0 | 544 | 0/0 | 0.184 | p.name ASC,
p.age ASC | In Pipeline 1 |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+
| +Projection | cache[p.name] AS `p.name`, p.age AS `p.age`
| 0 | 2 | 0 | | | | `p.name`
ASC | |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+ | +-----------------------+
|
| +NodeIndexSeekByRange | RANGE INDEX p:Person(name) WHERE name STARTS WITH $autostring_0,
cache[p.name] | 0 | 2 | 3 | 120 | 0/1 | 0.362
| p.name ASC | Fused in Pipeline 0 |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+

Total database accesses: 3, total allocated memory: 608

Top
The Top operator returns the first n rows sorted by a provided key. Instead of sorting the entire input, only
the top n rows are retained.

Example 465. Top

Query

PROFILE
MATCH (p:Person)
RETURN p
ORDER BY p.name
LIMIT 2

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page
Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
+------------------+----------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| +ProduceResults | p | 2 | 2 | 0 | |
2/0 | 0.093 | | |
| | +----------------------+----------------+------+---------+----------------
+------------------------+-----------+ | |
| +Top | `p.name` ASC LIMIT 2 | 2 | 2 | 0 | 1184 |
0/0 | 0.295 | p.name ASC | In Pipeline 1 |
| | +----------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+
| +Projection | p.name AS `p.name` | 14 | 14 | 14 | |
| | | |
| | +----------------------+----------------+------+---------+----------------+
| +------------+ |
| +NodeByLabelScan | p:Person | 14 | 14 | 35 | 120 |
3/0 | 0.166 | | Fused in Pipeline 0 |
+------------------+----------------------+----------------+------+---------+----------------
+------------------------+-----------+------------+---------------------+

Total database accesses: 85, total allocated memory: 1264

Partial Top
The PartialTop operator is an optimization of the Top operator that takes advantage of the ordering of the
incoming rows. This operator uses lazy evaluation and has a lower memory pressure in the system than
the Top operator. Partial top is only applicable when sorting on multiple columns.

Example 466. PartialTop

Query

PROFILE
MATCH (p:Person)
WHERE p.name STARTS WITH 'P'
RETURN p
ORDER BY p.name, p.age
LIMIT 2

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by
| Pipeline |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+
| +ProduceResults | p
| 0 | 2 | 0 | | 2/0 | 0.093 |
| |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
|
| +PartialTop | `p.name` ASC, `p.age` ASC LIMIT 2
| 0 | 2 | 0 | 640 | 0/0 | 0.870 | p.name ASC,
p.age ASC | In Pipeline 1 |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+
| +Projection | cache[p.name] AS `p.name`, p.age AS `p.age`
| 0 | 2 | 0 | | | | `p.name`
ASC | |
| |
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+ | +-----------------------+
|
| +NodeIndexSeekByRange | RANGE INDEX p:Person(name) WHERE name STARTS WITH $autostring_0,
cache[p.name] | 0 | 2 | 3 | 120 | 0/1 | 0.556
| p.name ASC | Fused in Pipeline 0 |
+-----------------------
+--------------------------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+-----------------------
+---------------------+

Total database accesses: 3, total allocated memory: 704

Union
The Union operator concatenates the results from the right child operator with the results from the left
child operator.
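
The example below uses UNION ALL, which keeps duplicate rows. As a rough sketch (not part of the
profiled example), a plain UNION removes duplicates, which typically shows up in the plan as an
additional Distinct operator above Union.

// Sketch: plain UNION removes duplicate rows; expect an extra Distinct
// operator above Union when profiling this variant.
PROFILE
MATCH (p:Location) RETURN p.name
UNION
MATCH (p:Country) RETURN p.name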

Example 467. Union

Query

PROFILE
MATCH (p:Location)
RETURN p.name
UNION ALL
MATCH (p:Country)
RETURN p.name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+--------------------+----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) |
Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------+----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | `p.name` | 20 | 0 | 0 | |
| | |
| | +----+--------------------+----------------+------+---------+----------------+
| | |
| +Union | 1 | | 20 | 0 | 0 | 0 |
0/0 | 0.000 | Fused in Pipeline 2 |
| |\ +----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| | +Projection | 2 | `p.name` | 10 | 0 | 0 | |
| | |
| | | +----+--------------------+----------------+------+---------+----------------+
| | |
| | +Projection | 3 | p.name AS `p.name` | 10 | 0 | 0 | |
| | |
| | | +----+--------------------+----------------+------+---------+----------------+
| | |
| | +NodeByLabelScan | 4 | p:Country | 10 | 0 | 0 | 120 |
0/0 | 0.049 | Fused in Pipeline 1 |
| | +----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +Projection | 5 | `p.name` | 10 | 0 | 0 | |
| | |
| | +----+--------------------+----------------+------+---------+----------------+
| | |
| +Projection | 6 | p.name AS `p.name` | 10 | 0 | 0 | |
| | |
| | +----+--------------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | 7 | p:Location | 10 | 0 | 0 | 120 |
0/0 | 0.077 | Fused in Pipeline 0 |
+--------------------+----+--------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 0, total allocated memory: 320

Unwind
The Unwind operator returns one row per item in a list.

Example 468. Unwind

Query

PROFILE
UNWIND range(1, 5) AS value
RETURN value

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------------------------------------+----------------+------+---------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Page
Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----------------------------------------+----------------+------+---------
+------------------------+-----------+---------------------+
| +ProduceResults | value | 10 | 5 | 0 |
| | |
| | +----------------------------------------+----------------+------+---------+
| | |
| +Unwind | range($autoint_0, $autoint_1) AS value | 10 | 5 | 0 |
0/0 | 0.000 | Fused in Pipeline 0 |
+-----------------+----------------------------------------+----------------+------+---------
+------------------------+-----------+---------------------+

Total database accesses: 0, total allocated memory: 184

Partitioned Unwind (introduced in Neo4j 5.17)


The PartitionedUnwind operator is a variant of the Unwind operator used by the parallel runtime. It allows
the list to be partitioned into different segments, where each segment can be processed independently in parallel.

Example 469. PartitionedUnwind

Query

CYPHER runtime=parallel
PROFILE
UNWIND range(1, 5) AS value
RETURN value

Query Plan

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+--------------------+----+----------------------------------------+----------------+------+---------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits
| Page Cache Hits/Misses | Time (ms) | Pipeline |
+--------------------+----+----------------------------------------+----------------+------+---------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | value | 10 | 5 | 0
| 0/0 | 0.119 | In Pipeline 1 |
| | +----+----------------------------------------+----------------+------+---------
+------------------------+-----------+---------------------+
| +PartitionedUnwind | 1 | range($autoint_0, $autoint_1) AS value | 10 | 5 | 0
| | | Fused in Pipeline 0 |
+--------------------+----+----------------------------------------+----------------+------+---------
+------------------------+-----------+---------------------+

Total database accesses: 0

Exhaustive Limit
The ExhaustiveLimit operator is similar to the Limit operator but will always exhaust its input. It is used
when combining LIMIT with updating clauses.

Example 470. ExhaustiveLimit

Query

PROFILE
MATCH (p:Person)
SET p.seen = true
RETURN p
LIMIT 3

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page
Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+----+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | p | 3 | 3 | 10 | 0 |
| | |
| | +----+---------------+----------------+------+---------+----------------+
| | |
| +ExhaustiveLimit | 1 | 3 | 3 | 3 | 0 | 32 |
| | |
| | +----+---------------+----------------+------+---------+----------------+
| | |
| +SetProperty | 2 | p.seen = true | 17 | 17 | 34 | |
| | |
| | +----+---------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | 3 | p:Person | 17 | 17 | 18 | 240 |
3/0 | 1.966 | Fused in Pipeline 0 |
+------------------+----+---------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 62, total allocated memory: 304

Optional
The Optional operator is used to solve some OPTIONAL MATCH queries. It will pull data from its source,
simply passing it through if any data exists. However, if no data is returned by its source, Optional will
yield a single row with all columns set to null.

Example 471. Optional

Query

PROFILE
MATCH (p:Person {name: 'me'})
OPTIONAL MATCH (q:Person {name: 'Lulu'})
RETURN p, q

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------+
| Operator | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------+
| +ProduceResults | p, q | 1 | 1 |
0 | | 2/0 | 0.079 | In Pipeline 2 |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------+
| +Apply | | 1 | 1 |
0 | | 0/0 | 0.096 | |
| |\ +-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------+
| | +Optional | p | 1 | 1 |
0 | 768 | 0/0 | 0.043 | In Pipeline 2 |
| | | +-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------+
| | +NodeIndexSeek | RANGE INDEX q:Person(name) WHERE name = $autostring_1 | 1 | 0 |
1 | 2152 | 1/0 | 0.098 | In Pipeline 1 |
| | +-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------+
| +NodeIndexSeek | RANGE INDEX p:Person(name) WHERE name = $autostring_0 | 1 | 1 |
2 | 120 | 0/1 | 0.364 | In Pipeline 0 |
+------------------+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------+

Total database accesses: 3, total allocated memory: 3000

Project Endpoints
The ProjectEndpoints operator projects the start and end node of a relationship.

Example 472. ProjectEndpoints

Query

PROFILE
CREATE (n)-[p:KNOWS]->(m)
WITH p AS r
MATCH (u)-[r]->(v)
RETURN u, v

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------+----+-----------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB
Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------+----+-----------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | u, v | 1 | 1 |
2 | 0 | | | |
| | +----+-----------------------------------------+----------------+------
+---------+----------------+ | | |
| +Apply | 1 | | 1 | 1 |
0 | | | | |
| |\ +----+-----------------------------------------+----------------+------
+---------+----------------+ | | |
| | +ProjectEndpoints | 2 | (u)-[r]->(v) | 1 | 1 |
0 | | | | |
| | | +----+-----------------------------------------+----------------+------
+---------+----------------+ | | |
| | +Argument | 3 | r | 1 | 1 |
0 | 4328 | 0/0 | 0.194 | Fused in Pipeline 2 |
| | +----+-----------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +Eager | 4 | read/create conflict (Operator: 6 vs 2) | 1 | 1 |
0 | 368 | 0/0 | 0.025 | In Pipeline 1 |
| | +----+-----------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +Projection | 5 | p AS r | 1 | 1 |
0 | | | | |
| | +----+-----------------------------------------+----------------+------
+---------+----------------+ | | |
| +Create | 6 | (n), (m), (n)-[p:KNOWS]->(m) | 1 | 1 |
3 | | 0/0 | 0.000 | Fused in Pipeline 0 |
+---------------------+----+-----------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 5, total allocated memory: 4920

Projection
For each incoming row, the Projection operator evaluates a set of expressions and produces a row with
the results of the expressions.

Example 473. Projection

Query

PROFILE
RETURN 'hello' AS greeting

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+---------------------------+----------------+------+---------
+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Page Cache
Hits/Misses | Time (ms) | Pipeline |
+-----------------+---------------------------+----------------+------+---------
+------------------------+-----------+---------------------+
| +ProduceResults | greeting | 1 | 1 | 0 |
| | |
| | +---------------------------+----------------+------+---------+
| | |
| +Projection | $autostring_0 AS greeting | 1 | 1 | 0 |
0/0 | 0.000 | Fused in Pipeline 0 |
+-----------------+---------------------------+----------------+------+---------
+------------------------+-----------+---------------------+

Total database accesses: 0, total allocated memory: 184

Shortest path
The ShortestPath operator finds one or all shortest paths between two previously matched node variables.
This operator is used for the shortestPath() and allShortestPaths() functions.

Example 474. ShortestPath

Query

PROFILE
MATCH
(andy:Person {name: 'Andy'}),
(mattias:Person {name: 'Mattias'}),
p = shortestPath((andy)-[*]-(mattias))
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------+-------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------+
| Operator | Details | Estimated Rows
| Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------+-------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------+
| +ProduceResults | p | 1
| 1 | 0 | | 1/0 | 0.241 | |
| | +-------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+ |
| +ShortestPath | p = (andy)-[anon_0*]-(mattias) | 1
| 1 | 1 | 1424 | | | In Pipeline 1 |
| | +-------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------+
| +MultiNodeIndexSeek | RANGE INDEX andy:Person(name) WHERE name = $autostring_0, | 1
| 1 | 4 | 120 | 1/1 | 0.308 | In Pipeline 0 |
| | RANGE INDEX mattias:Person(name) WHERE name = $autostring_1 |
| | | | | | |
+---------------------+-------------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------+

Total database accesses: 5, total allocated memory: 1488

StatefulShortestPath(Into) (introduced in Neo4j 5.21)


The StatefulShortestPath(Into) operator finds shortest paths between a start node and a single target
node. It uses a bidirectional breadth-first search (BFS) algorithm, which performs two BFS invocations at
the same time, one from the left boundary node and one from the right boundary node. Once a node is
found by both BFS invocations, which indicates that it can be reached from both boundary nodes, the
algorithm successfully terminates. If one of the BFS invocations exhausts its search before intersecting,
either because no further nodes can be reached or because the maximum number of hops has been
reached, then there is no valid path between the boundary nodes and the algorithm terminates.

Example 475. StatefulShortestPath(Into)

Query

PROFILE
MATCH
p = ALL SHORTEST (chris:Person {name: 'Chris'})(()-[]-()-[]-()){1,}(stefan:Person {name: 'Stefan'})
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------------+----
+--------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+-----------------------------+----
+--------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| +ProduceResults | 0 | p
| 2 | 2 | 0 | 0 | 0/0 | 0.039 |
|
| | +----
+--------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
|
| +Projection | 1 | (chris) ((anon_12)-[anon_14]-(anon_13)-[anon_11]-())* (stefan)
AS p | 2 | 2 | 0 | |
0/0 | 1.365 | |
| | +----
+--------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
|
| +StatefulShortestPath(Into) | 2 | SHORTEST 1 GROUPS (chris) ((`anon_5`)-[`anon_6`]-(`anon_7`)-
[`anon_8`]-(`anon_9`)){1, } (stefan) | 2 | 2 | 39 | 22237 |
1/0 | 37.376 | In Pipeline 1 |
| | +----
+--------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| +MultiNodeIndexSeek | 3 | UNIQUE chris:Person(name) WHERE name = $autostring_0,
| 1 | 1 | 4 | 376 | 1/1 | 10.245 | In Pipeline
0 |
| | | UNIQUE stefan:Person(name) WHERE name = $autostring_1
| | | | | | |
|
+-----------------------------+----
+--------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+

Total database accesses: 43, total allocated memory: 22557

StatefulShortestPath(All) (introduced in Neo4j 5.21)


The StatefulShortestPath(All) operator finds shortest paths from a single node to multiple target nodes.
It uses a breadth-first search algorithm.

Example 476. StatefulShortestPath(All)

Query

PROFILE
MATCH
p = ALL SHORTEST (chris:Person {name:'Chris'})(()-[]-()-[]-()){1,}(location:Location)
RETURN length(p) AS pathLength, location.name AS locationName

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+----------------------------+----
+----------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| Operator | Id | Details
| Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
|
+----------------------------+----
+----------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| +ProduceResults | 0 | pathLength, locationName
| 14 | 20 | 0 | 0 | 0/0 | 0.074 |
|
| | +----
+----------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
|
| +Projection | 1 | length((chris) ((anon_12)-[anon_14]-(anon_13)-[anon_11]-())*
(location)) AS pathLength, | 14 | 20 | 40 | |
1/0 | 6.828 | |
| | | | location.name AS locationName
| | | | | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------+
|
| +StatefulShortestPath(All) | 2 | SHORTEST 1 GROUPS (chris) ((`anon_5`)-[`anon_6`]-(`anon_7`)-
[`anon_8`]-(`anon_9`)){1, } (location) | 14 | 20 | 179 | 37663 |
1/0 | 52.849 | In Pipeline 1 |
| | | | expanding from: chris
| | | | | | |
|
| | | | inlined predicates: location:Location
| | | | | | |
|
| | +----
+----------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+
| +NodeUniqueIndexSeek | 3 | UNIQUE chris:Person(name) WHERE name = $autostring_0
| 1 | 1 | 2 | 376 | 0/1 | 9.078 | In Pipeline
0 |
+----------------------------+----
+----------------------------------------------------------------------------------------------------
+----------------+------+---------+----------------+------------------------+-----------
+---------------+

Total database accesses: 221, total allocated memory: 37983

Empty Row
The EmptyRow operator returns a single row with no columns.

Example 477. EmptyRow

Query

PROFILE
CYPHER runtime=slotted
FOREACH (value IN [1,2,3] | MERGE (:Person {age: value}))

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| Operator           | Details                             | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| +ProduceResults    |                                     | 1              | 0    | 0       | 0/0                    |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| +EmptyResult       |                                     | 1              | 0    | 0       | 0/0                    |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| +Foreach           | value IN [1, 2, 3]                  | 1              | 1    | 0       | 0/0                    |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| | +Merge           | CREATE (anon_0:Person {age: value}) | 1              | 3    | 9       | 0/0                    |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| | +Filter          | anon_0.age = value                  | 1              | 0    | 184     | 2/0                    |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| | +NodeByLabelScan | anon_0:Person                       | 35             | 108  | 111     | 3/0                    |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+
| +EmptyRow          |                                     | 1              | 1    | 0       | 0/0                    |
+--------------------+-------------------------------------+----------------+------+---------+------------------------+

Total database accesses: 304, total allocated memory: 64

Procedure Call
The ProcedureCall operator indicates an invocation to a procedure.

Example 478. ProcedureCall

Query

PROFILE
CALL db.labels() YIELD label
RETURN *
ORDER BY label

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory
(Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline |
+-----------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------------+
| +ProduceResults | label | 10 | 4 | 0 |
| 0/0 | 0.091 | | |
| | +-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+ | |
| +Sort | label ASC | 10 | 4 | 0 |
536 | 0/0 | 0.178 | label ASC | In Pipeline 1 |
| | +-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------------+
| +ProcedureCall | db.labels() :: (label :: STRING) | 10 | 4 | |
| | | | Fused in Pipeline 0 |
+-----------------+-----------------------------------+----------------+------+---------
+----------------+------------------------+-----------+------------+---------------------+

Total database accesses: ?, total allocated memory: 600

Cache Properties
The CacheProperties operator reads node and relationship properties and caches them in the current
row. Future accesses to these properties can avoid reading from the store, which speeds up the query.
In the plan below, l.name is cached before Expand(All), where there are fewer rows.

Example 479. CacheProperties

Query

PROFILE
MATCH (l:Location)<-[:WORKS_IN]-(p:Person)
RETURN
l.name AS location,
p.name AS name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----+-------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits
| Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+----+-------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | location, name | 13 | 13 | 0
| | | | |
| | +----+-------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Projection | 1 | cache[l.name] AS location, p.name AS name | 13 | 13 | 26
| | | | |
| | +----+-------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Filter | 2 | p:Person | 13 | 13 | 26
| | | | |
| | +----+-------------------------------------------+----------------+------
+---------+----------------+ | | |
| +Expand(All) | 3 | (l)<-[anon_0:WORKS_IN]-(p) | 13 | 13 | 24
| | | | |
| | +----+-------------------------------------------+----------------+------
+---------+----------------+ | | |
| +CacheProperties | 4 | cache[l.name] | 10 | 10 | 20
| | | | |
| | +----+-------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeByLabelScan | 5 | l:Location | 10 | 10 | 11
| 120 | 4/0 | 0.344 | Fused in Pipeline 0 |
+------------------+----+-------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 107, total allocated memory: 200

Create (nodes and relationships)


The Create operator is used to create nodes and relationships.

Example 480. Create

Query

PROFILE
CREATE
(max:Person {name: 'Max'}),
(chris:Person {name: 'Chris'})
CREATE (max)-[:FRIENDS_WITH]->(chris)

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+---------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| Operator | Details |
Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+---------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+
| +ProduceResults | |
1 | 0 | 0 | | | |
| | +---------------------------------------------------------------------------
+----------------+------+---------+ | | |
| +EmptyResult | |
1 | 0 | 0 | | | |
| | +---------------------------------------------------------------------------
+----------------+------+---------+ | | |
| +Create | (max:Person {name: $autostring_0}), (chris:Person {name: $autostring_1}), |
1 | 1 | 7 | 0/0 | 0.000 | Fused in Pipeline 0 |
| | (max)-[anon_0:FRIENDS_WITH]->(chris) |
| | | | | |
+-----------------+---------------------------------------------------------------------------
+----------------+------+---------+------------------------+-----------+---------------------+

Total database accesses: 7, total allocated memory: 184

Delete (nodes and relationships)


The Delete operator is used to delete a node or a relationship.

Example 481. Delete

Query

PROFILE
MATCH (you:Person {name: 'you'})
DELETE you

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----+--------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----+--------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | | 0 |
0 | 0 | | | | |
| | +----+--------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +EmptyResult | 1 | | 0 |
0 | 0 | | | | |
| | +----+--------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +Delete | 2 | you | 0 |
0 | 0 | | | | |
| | +----+--------------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +NodeIndexSeek | 3 | RANGE INDEX you:Person(name) WHERE name = $autostring_0 | 0 |
0 | 1 | 120 | 1/0 | 0.330 | Fused in Pipeline 0 |
+-----------------+----+--------------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 13, total allocated memory: 216

Detach Delete
The DetachDelete operator is used in all queries containing the DETACH DELETE clause, when deleting
nodes and their relationships.

Example 482. DetachDelete

Query

PROFILE
MATCH (p:Person)
DETACH DELETE p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator         | Details  | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+------------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults  |          | 14             | 0    | 0       |                |                        |           |                     |
+------------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +EmptyResult     |          | 14             | 0    | 0       |                |                        |           |                     |
+------------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +DetachDelete    | p        | 14             | 14   | 41      |                |                        |           |                     |
+------------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +NodeByLabelScan | p:Person | 14             | 14   | 35      | 120            | 21/0                   | 12,439    | Fused in Pipeline 0 |
+------------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 112, total allocated memory: 200

Set Labels
The SetLabels operator is used when setting labels on a node.

Example 483. SetLabels

Query

PROFILE
MATCH (n)
SET n:Person

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator        | Details  | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults |          | 35             | 0    | 0       |                |                        |           |                     |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +EmptyResult    |          | 35             | 0    | 0       |                |                        |           |                     |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +SetLabels      | n:Person | 35             | 35   | 22      |                |                        |           |                     |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +AllNodesScan   | n        | 35             | 35   | 36      | 120            | 3/0                    | 0.873     | Fused in Pipeline 0 |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 58, total allocated memory: 184

Remove Labels
The RemoveLabels operator is used when deleting labels from a node.

Example 484. RemoveLabels

Query

PROFILE
MATCH (n)
REMOVE n:Person

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator        | Details  | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults |          | 35             | 0    | 0       |                |                        |           |                     |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +EmptyResult    |          | 35             | 0    | 0       |                |                        |           |                     |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +RemoveLabels   | n:Person | 35             | 35   | 15      |                |                        |           |                     |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +AllNodesScan   | n        | 35             | 35   | 36      | 120            | 3/0                    | 0.765     | Fused in Pipeline 0 |
+-----------------+----------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 51, total allocated memory: 184

Set Node Properties From Map


The SetNodePropertiesFromMap operator is used when setting properties from a map on a node.

Example 485. SetNodePropertiesFromMap

Query

PROFILE
MATCH (n)
SET n = {weekday: 'Monday', meal: 'Lunch'}

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+---------------------------+---------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows |
Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+---------------------------+---------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | | 35 |
0 | 0 | | | | |
| | +---------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +EmptyResult | | 35 |
0 | 0 | | | | |
| | +---------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +SetNodePropertiesFromMap | n = {weekday: $autostring_0, meal: $autostring_1} | 35 |
35 | 105 | | | | |
| | +---------------------------------------------------+----------------
+------+---------+----------------+ | | |
| +AllNodesScan | n | 35 |
35 | 36 | 120 | 5/0 | 3.954 | Fused in Pipeline 0 |
+---------------------------+---------------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 141, total allocated memory: 184

Set Relationship Properties From Map


The SetRelationshipPropertiesFromMap operator is used when setting properties from a map on a
relationship.

Example 486. SetRelationshipPropertiesFromMap

Query

PROFILE
MATCH (n)-[r]->(m)
SET r = {weight: 5, unit: 'kg'}

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------------------------+-----------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| Operator | Details | Estimated Rows
| Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------------------------+-----------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | | 28
| 0 | 0 | | | | |
| | +-----------------------------------------------+----------------
+------+---------+----------------+ | | |
| +EmptyResult | | 28
| 0 | 0 | | | | |
| | +-----------------------------------------------+----------------
+------+---------+----------------+ | | |
| +SetRelationshipPropertiesFromMap | r = {weight: $autoint_0, unit: $autostring_1} | 28
| 28 | 84 | | | | |
| | +-----------------------------------------------+----------------
+------+---------+----------------+ | | |
| +DirectedAllRelationshipsScan | (n)-[r]->(m) | 28
| 28 | 28 | 120 | 5/0 | 15.278 | Fused in Pipeline 0 |
+-----------------------------------+-----------------------------------------------+----------------
+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 112, total allocated memory: 184

Set Property
The SetProperty operator is used when setting a property on a node or relationship.

Example 487. SetProperty

Query

PROFILE
MATCH (n)
SET n.checked = true

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator        | Details          | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+-----------------+------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults |                  | 35             | 0    | 0       |                |                        |           |                     |
+-----------------+------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +EmptyResult    |                  | 35             | 0    | 0       |                |                        |           |                     |
+-----------------+------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +SetProperty    | n.checked = true | 35             | 35   | 70      |                |                        |           |                     |
+-----------------+------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +AllNodesScan   | n                | 35             | 35   | 36      | 120            | 3/0                    | 0.753     | Fused in Pipeline 0 |
+-----------------+------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 106, total allocated memory: 184

Set Properties
The SetProperties operator is used when setting multiple properties on a node or relationship.

Example 488. SetProperties

Query

PROFILE
MATCH (n)
SET n.weekDay = 'Monday', n.meal = 'Lunch'

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----+---------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----+---------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | | 35 | 0 |
0 | 0 | | | |
| | +----+---------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +EmptyResult | 1 | | 35 | 0 |
0 | | | | |
| | +----+---------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +SetProperties | 2 | n.weekDay = $autostring_0, n.meal = $autostring_1 | 35 | 35 |
105 | | | | |
| | +----+---------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +AllNodesScan | 3 | n | 35 | 35 |
36 | 248 | 3/0 | 152.289 | Fused in Pipeline 0 |
+-----------------+----+---------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 141, total allocated memory: 312

Create Constraint
The CreateConstraint operator creates a constraint.

This constraint can have any of the available constraint types:

• Property uniqueness constraints

• Property existence constraints (Enterprise Edition)

• Property type constraints (Enterprise Edition)

• Node or relationship key constraints (Enterprise Edition)

The following query will create a property uniqueness constraint with the name uniqueness on the name
property of nodes with the Country label.

Example 489. CreateConstraint

Query

PROFILE
CREATE CONSTRAINT uniqueness
FOR (c:Country) REQUIRE c.name is UNIQUE

Query Plan

Planner ADMINISTRATION

Runtime SCHEMA

Runtime version 5.25

+-------------------+------------------------------------------------------------------+
| Operator | Details |
+-------------------+------------------------------------------------------------------+
| +CreateConstraint | CONSTRAINT uniqueness FOR (c:Country) REQUIRE (c.name) IS UNIQUE |
+-------------------+------------------------------------------------------------------+

Total database accesses: ?
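
The same CreateConstraint operator is used for the other constraint types listed above. As an illustration (the constraint names below are hypothetical, and both constraint types require Enterprise Edition), a property existence constraint and a property type constraint could be created as follows:

CREATE CONSTRAINT person_name_exists
FOR (p:Person) REQUIRE p.name IS NOT NULL

CREATE CONSTRAINT person_name_is_string
FOR (p:Person) REQUIRE p.name IS :: STRING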

Do Nothing If Exists (constraint)


To avoid an error when creating the same constraint twice, the DoNothingIfExists operator is used for
constraints. It checks that no constraint with the given name, and no constraint of the same type and
schema, already exists before the CreateConstraint operator creates the constraint. If such a constraint is
found, execution stops and no new constraint is created. The following query will create a property
uniqueness constraint with the name uniqueness on the name property of nodes with the Country label only
if no constraint named uniqueness, and no property uniqueness constraint on (:Country {name}), already
exists.

Example 490. DoNothingIfExists(CONSTRAINT)

Query

PROFILE
CREATE CONSTRAINT uniqueness IF NOT EXISTS
FOR (c:Country) REQUIRE c.name is UNIQUE

Query Plan

Planner ADMINISTRATION

Runtime SCHEMA

Runtime version 5.25

+--------------------------------+------------------------------------------------------------------+
| Operator | Details |
+--------------------------------+------------------------------------------------------------------+
| +CreateConstraint | CONSTRAINT uniqueness FOR (c:Country) REQUIRE (c.name) IS UNIQUE |
| | +------------------------------------------------------------------+
| +DoNothingIfExists(CONSTRAINT) | CONSTRAINT uniqueness FOR (c:Country) REQUIRE (c.name) IS UNIQUE |
+--------------------------------+------------------------------------------------------------------+

Total database accesses: ?

Drop Constraint
The DropConstraint operator removes a constraint by name, regardless of the constraint type.

Example 491. DropConstraint

Query

PROFILE
DROP CONSTRAINT uniqueness IF EXISTS

Query Plan

Planner ADMINISTRATION

Runtime SCHEMA

Runtime version 5.25

+-----------------+---------------------------------+
| Operator | Details |
+-----------------+---------------------------------+
| +DropConstraint | CONSTRAINT uniqueness IF EXISTS |
+-----------------+---------------------------------+

Total database accesses: ?

Show Constraints
The ShowConstraints operator lists constraints. It may include filtering on constraint type and can have
either default or full output.

Example 492. ShowConstraints

Query

PROFILE
SHOW CONSTRAINTS

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+------------------+-------------------------------------------------------------------+----------------+------+---------+------------------------+
| Operator         | Details                                                           | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+------------------+-------------------------------------------------------------------+----------------+------+---------+------------------------+
| +ProduceResults  | id, name, type, entityType, labelsOrTypes, properties, ownedIndex | 10             | 3    | 0       | 0/0                    |
+------------------+-------------------------------------------------------------------+----------------+------+---------+------------------------+
| +ShowConstraints | allConstraints, defaultColumns                                    | 10             | 3    | 2       | 0/0                    |
+------------------+-------------------------------------------------------------------+----------------+------+---------+------------------------+

Total database accesses: 2, total allocated memory: 64
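
As noted above, the listing can be filtered on constraint type, and additional columns can be requested with YIELD. A sketch of a filtered variant (the label used in the WHERE clause is only illustrative):

SHOW UNIQUE CONSTRAINTS
YIELD name, labelsOrTypes, properties
WHERE 'Country' IN labelsOrTypes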

Create Index
The CreateIndex operator creates an index.

This index can either be a fulltext, point, range, text, vector, or lookup index.

Example 493. CreateIndex

The following query will create an index with the name my_index on the name property of nodes with
the Country label.

Query

PROFILE
CREATE INDEX my_index
FOR (c:Country) ON (c.name)

Query Plan

Planner ADMINISTRATION

Runtime SCHEMA

Runtime version 5.25

+--------------+-----------------------------------------------+
| Operator | Details |
+--------------+-----------------------------------------------+
| +CreateIndex | RANGE INDEX my_index FOR (:Country) ON (name) |
+--------------+-----------------------------------------------+

Total database accesses: ?
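
The same operator is used for the other index types listed above. For example (the index names and schemas below are hypothetical), a text index and a point index could be created like this:

CREATE TEXT INDEX person_nickname_text
FOR (p:Person) ON (p.nickname)

CREATE POINT INDEX place_location_point
FOR (pl:Place) ON (pl.location)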

Do Nothing If Exists (index)


To avoid an error when creating the same index twice, the DoNothingIfExists operator is used for indexes.
It checks that no index with the given name or schema already exists before the CreateIndex operator
creates the index. If such an index is found, execution stops and no new index is created. The following
query will create an index with the name my_index on the since property of relationships with the KNOWS
relationship type only if no such index already exists.

Example 494. DoNothingIfExists(INDEX)

Query

PROFILE
CREATE INDEX my_index IF NOT EXISTS
FOR ()-[k:KNOWS]-() ON (k.since)

Query Plan

Planner ADMINISTRATION

Runtime SCHEMA

Runtime version 5.25

+---------------------------+----------------------------------------------------+
| Operator | Details |
+---------------------------+----------------------------------------------------+
| +CreateIndex | RANGE INDEX my_index FOR ()-[:KNOWS]-() ON (since) |
| | +----------------------------------------------------+
| +DoNothingIfExists(INDEX) | RANGE INDEX my_index FOR ()-[:KNOWS]-() ON (since) |
+---------------------------+----------------------------------------------------+

Total database accesses: ?

Drop Index
The DropIndex operator removes an index using the name of the index.

Example 495. DropIndex

Query

PROFILE
DROP INDEX my_index

Query Plan

Planner ADMINISTRATION

Runtime SCHEMA

Runtime version 5.25

+------------+---------------+
| Operator | Details |
+------------+---------------+
| +DropIndex | INDEX my_index|
+------------+---------------+

Total database accesses: ?

Show Indexes
The ShowIndexes operator lists indexes. It may include filtering on index type and can have either default or
full output.

Example 496. ShowIndexes

Query

PROFILE
SHOW INDEXES

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------
+-------------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+-----------------
+-------------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+
| +ProduceResults | id, name, state, populationPercent, type, entityType, labelsOrTypes, properties,
indexProvider, | 10 | 9 | 0 | 0/0 |
| | | owningConstraint
| | | | |
| |
+-------------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+
| +ShowIndexes | allIndexes, defaultColumns
| 10 | 9 | 2 | 0/0 |
+-----------------
+-------------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+

Total database accesses: 2, total allocated memory: 64
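
The same operator handles type-filtered and full listings. A sketch of a filtered variant (illustrative):

SHOW RANGE INDEXES
YIELD name, labelsOrTypes, properties, state
WHERE state = 'ONLINE'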

Show Functions
The ShowFunctions operator lists functions. It may include filtering on built-in versus user-defined functions,
as well as on whether a given user can execute the function. The output can be either default or full.

Example 497. ShowFunctions

Query

PROFILE
SHOW FUNCTIONS

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------+-----------------------------------------------------+----------------+------+---------+------------------------+
| Operator        | Details                                             | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+-----------------+-----------------------------------------------------+----------------+------+---------+------------------------+
| +ProduceResults | name, category, description                         | 10             | 147  | 0       | 0/0                    |
+-----------------+-----------------------------------------------------+----------------+------+---------+------------------------+
| +ShowFunctions  | allFunctions, functionsForUser(all), defaultColumns | 10             | 147  | 0       | 0/0                    |
+-----------------+-----------------------------------------------------+----------------+------+---------+------------------------+

Total database accesses: 0, total allocated memory: 64
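
A sketch of a filtered variant, restricting the listing to built-in functions executable by the current user (illustrative):

SHOW BUILT IN FUNCTIONS EXECUTABLE BY CURRENT USER
YIELD name, category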

Show Procedures
The ShowProcedures operator lists procedures. It may include filtering on whether a given user can execute
the procedure and can have either default or full output.

Example 498. ShowProcedures

Query

PROFILE
SHOW PROCEDURES

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------+----------------------------------------+----------------+------+---------+------------------------+
| Operator        | Details                                | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+-----------------+----------------------------------------+----------------+------+---------+------------------------+
| +ProduceResults | name, description, mode, worksOnSystem | 10             | 55   | 0       | 0/0                    |
+-----------------+----------------------------------------+----------------+------+---------+------------------------+
| +ShowProcedures | proceduresForUser(all), defaultColumns | 10             | 55   | 0       | 0/0                    |
+-----------------+----------------------------------------+----------------+------+---------+------------------------+

Total database accesses: 0, total allocated memory: 64
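
A sketch of a filtered variant, restricting the listing to procedures the current user can execute (illustrative):

SHOW PROCEDURES EXECUTABLE BY CURRENT USER
YIELD name, mode, worksOnSystem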

Show Settings
The ShowSettings operator lists configuration settings.

Example 499. ShowSettings

Query

PROFILE
SHOW SETTINGS

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-----------------+---------------------------------------------------+----------------+------+---------+------------------------+
| Operator        | Details                                           | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+-----------------+---------------------------------------------------+----------------+------+---------+------------------------+
| +ProduceResults | name, value, isDynamic, defaultValue, description | 10             | 264  | 0       | 0/0                    |
+-----------------+---------------------------------------------------+----------------+------+---------+------------------------+
| +ShowSettings   | allSettings, defaultColumns                       | 10             | 264  | 0       | 0/0                    |
+-----------------+---------------------------------------------------+----------------+------+---------+------------------------+

Total database accesses: 0, total allocated memory: 64
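
ShowSettings can also be limited to specific settings given by name. For example (the setting names below are only examples):

SHOW SETTINGS 'server.memory.heap.initial_size', 'server.memory.heap.max_size'
YIELD name, value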

Show Transactions
The ShowTransactions operator lists transactions. It may include filtering on given transaction IDs and can
have either default or full output.

Example 500. ShowTransactions

Query

PROFILE
SHOW TRANSACTIONS

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-------------------
+-----------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+
| Operator | Details
| Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+-------------------
+-----------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+
| +ProduceResults | database, transactionId, currentQueryId, connectionId, clientAddress, username,
currentQuery, | 10 | 1 | 0 | 0/0 |
| | | startTime, status, elapsedTime
| | | | |
| |
+-----------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+
| +ShowTransactions | defaultColumns, allTransactions
| 10 | 1 | 0 | 0/0 |
+-------------------
+-----------------------------------------------------------------------------------------------
+----------------+------+---------+------------------------+

Total database accesses: 0, total allocated memory: 64
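
As mentioned above, the listing can be filtered on transaction IDs. For example (the ID below is a placeholder):

SHOW TRANSACTIONS 'neo4j-transaction-123'
YIELD transactionId, currentQuery, status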

Terminate Transactions
The TerminateTransactions operator terminates transactions by ID.

Example 501. TerminateTransactions

Query

PROFILE
TERMINATE TRANSACTIONS 'database-transaction-123'

Query Plan

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-------------------------+---------------------------------------------------------+----------------+------+---------+------------------------+
| Operator                | Details                                                 | Estimated Rows | Rows | DB Hits | Page Cache Hits/Misses |
+-------------------------+---------------------------------------------------------+----------------+------+---------+------------------------+
| +ProduceResults         | transactionId, username, message                        | 10             | 1    | 0       | 0/0                    |
+-------------------------+---------------------------------------------------------+----------------+------+---------+------------------------+
| +TerminateTransactions  | defaultColumns, transactions(database-transaction-123)  | 10             | 1    | 0       | 0/0                    |
+-------------------------+---------------------------------------------------------+----------------+------+---------+------------------------+

Total database accesses: 0, total allocated memory: 64
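
When the transaction IDs are not known up front, TerminateTransactions can consume IDs produced by ShowTransactions in the same query. A sketch (the username is a placeholder):

SHOW TRANSACTIONS
YIELD transactionId AS txId, username
WHERE username = 'alice'
TERMINATE TRANSACTIONS txId
YIELD message
RETURN txId, message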

Cypher runtimes
The runtime is the final step of a Cypher query, where query plans received from the planner are executed
as quickly and efficiently as possible.

Cypher has three available runtimes: slotted, pipelined, and parallel. Though the default runtime generally
provides the best performance, there are situations when deciding which runtime to use is an important
part of maximizing the efficiency of queries. It is, therefore, important for advanced users to understand
the different runtimes offered by Neo4j.

This chapter contains the following sections:

• Runtime concepts - a deep-dive into the concepts behind Cypher runtimes.

• Parallel runtime: reference - information about queries, configuration settings, and using the parallel
runtime on Aura.

Runtime concepts
In Cypher, there are three types of runtimes: slotted, pipelined, and parallel. In general, the default runtimes
(the pipelined runtime in Enterprise Edition) provide the best query performance. However, each runtime
offers advantages and disadvantages, and there are scenarios when deciding which runtime to use is an
important step in maximizing the efficiency of queries.

This is a step-by-step guide to the concepts behind each of the three available Cypher runtimes. For

readers not familiar with reading the execution plans produced by Cypher queries, it is recommended to
first read the section on Understanding execution plans.

Example graph
The following graph is used for the queries on this page:

[Figure: example graph showing Station nodes (Peckham Rye, Denmark Hill, Clapham High Street, Wandsworth Road, Clapham Junction) and Stop nodes with arrives/departs time properties, connected by CALLS_AT and NEXT relationships.]

The graph contains two types of nodes: Stop and Station. Each Stop on a train service CALLS_AT one
Station, and has the properties arrives and departs that give the times the train is at the Station.
Following the NEXT relationship of a Stop will give the next Stop of a service.

To recreate the graph, run the following query against an empty Neo4j database:

Query

CREATE (pmr:Station {name: 'Peckham Rye'}),


(dmk:Station {name: 'Denmark Hill'}),
(clp:Station {name: 'Clapham High Street'}),
(wwr:Station {name: 'Wandsworth Road'}),
(clj:Station {name: 'Clapham Junction'}),
(s1:Stop {arrives: time('17:19'), departs: time('17:20')}),
(s2:Stop {arrives: time('17:12'), departs: time('17:13')}),
(s3:Stop {arrives: time('17:10'), departs: time('17:11')}),
(s4:Stop {arrives: time('17:06'), departs: time('17:07')}),
(s5:Stop {arrives: time('16:58'), departs: time('17:01')}),
(s6:Stop {arrives: time('17:17'), departs: time('17:20')}),
(s7:Stop {arrives: time('17:08'), departs: time('17:10')}),
(clj)<-[:CALLS_AT]-(s1), (wwr)<-[:CALLS_AT]-(s2),
(clp)<-[:CALLS_AT]-(s3), (dmk)<-[:CALLS_AT]-(s4),
(pmr)<-[:CALLS_AT]-(s5), (clj)<-[:CALLS_AT]-(s6),
(dmk)<-[:CALLS_AT]-(s7),
(s5)-[:NEXT {distance: 1.2}]->(s4),(s4)-[:NEXT {distance: 0.34}]->(s3),
(s3)-[:NEXT {distance: 0.76}]->(s2), (s2)-[:NEXT {distance: 0.3}]->(s1),
(s7)-[:NEXT {distance: 1.4}]->(s6)

Slotted runtime
The slotted runtime is the default runtime for Neo4j Community Edition. Users of Neo4j Enterprise Edition

must prepend their query with CYPHER runtime = slotted in order for a query to run with slotted runtime.
For example:

Query

EXPLAIN
CYPHER runtime = slotted
MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
((:Stop)-[:NEXT]->(:Stop))+
(a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
RETURN count(*)

This query will generate the following execution plan:

Planner COST

Runtime SLOTTED

Runtime version 5.25

+-------------------+----+------------------------------------------------------------------------
+----------------+
| Operator | Id | Details |
Estimated Rows |
+-------------------+----+------------------------------------------------------------------------
+----------------+
| +ProduceResults | 0 | `count(*)` |
1 |
| | +----+------------------------------------------------------------------------
+----------------+
| +EagerAggregation | 1 | count(*) AS `count(*)` |
1 |
| | +----+------------------------------------------------------------------------
+----------------+
| +Filter | 2 | not anon_1 = anon_5 AND anon_0.name = $autostring_0 AND anon_0:Station |
0 |
| | +----+------------------------------------------------------------------------
+----------------+
| +Expand(All) | 3 | (d)-[anon_1:CALLS_AT]->(anon_0) |
0 |
| | +----+------------------------------------------------------------------------
+----------------+
| +Filter | 4 | d:Stop |
0 |
| | +----+------------------------------------------------------------------------
+----------------+
| +Repeat(Trail) | 5 | (a) (...){1, *} (d) |
0 |
| |\ +----+------------------------------------------------------------------------
+----------------+
| | +Filter | 6 | isRepeatTrailUnique(anon_7) AND anon_2:Stop |
6 |
| | | +----+------------------------------------------------------------------------
+----------------+
| | +Expand(All) | 7 | (anon_4)<-[anon_7:NEXT]-(anon_2) |
6 |
| | | +----+------------------------------------------------------------------------
+----------------+
| | +Filter | 8 | anon_4:Stop |
11 |
| | | +----+------------------------------------------------------------------------
+----------------+
| | +Argument | 9 | anon_4 |
13 |
| | +----+------------------------------------------------------------------------
+----------------+
| +Filter | 10 | a:Stop |
0 |
| | +----+------------------------------------------------------------------------
+----------------+
| +Expand(All) | 11 | (anon_6)<-[anon_5:CALLS_AT]-(a) |
0 |
| | +----+------------------------------------------------------------------------
+----------------+
| +Filter | 12 | anon_6.name = $autostring_1 |
1 |
| | +----+------------------------------------------------------------------------
+----------------+
| +NodeByLabelScan | 13 | anon_6:Station |
10 |
+-------------------+----+------------------------------------------------------------------------
+----------------+

The physical plan produced by slotted runtimes is a one-to-one mapping from the logical plan, where each
logical operator maps to a corresponding physical operator, and where the operators are processed row-
by-row. When using slotted runtime, each variable in the query gets a dedicated “slot”, which the runtime
uses for accessing the data mapped to the given variable, hence the name “slotted”.

The slotted runtime uses the traditional execution model of most databases, known as the iterator or
“Volcano” model. This is a pull-based process where each operator in the tree “pulls” rows of data from its
child operator through a virtual function call. In this way, data is pulled up from the bottom of the
execution plan to the top, generating an eruption-like flow of data.

Considerations

The slotted runtime is the first high-performance runtime introduced in Neo4j, replacing the original (and
slower) interpreted runtime, which is now retired.

The slotted runtime is an interpreted runtime, meaning that it interprets the logical plan sent by the planner
operator-by-operator. In general, this is a convenient and flexible approach capable of handling all
operators and queries. The slotted runtime is conceptually similar to interpreted programming languages,
in that it has a shorter planning phase because it does not need to generate all the code for the query
before execution (unlike compiled runtimes, discussed in more detail below).[16]

In general, users of Neo4j Enterprise Edition should not have to use slotted runtime. However, there are
scenarios where the fast planning phase of the slotted runtime may be useful. For example, if you are
using an application that generates short queries that are not cached (i.e. never, or very rarely, repeated),
then the slotted runtime may be preferable because of its faster planning time.

There are, however, limitations to the slotted runtime. The continuous calling of virtual functions between
each operator uses CPU cycles which results in slower query execution. Furthermore, the iterator model
can lead to poor data locality, which can cause a slower query execution. This is because the process of
individual rows being pulled from different operators makes it difficult to make efficient use of CPU caches.

Pipelined runtime (Enterprise Edition)


The pipelined runtime is the default runtime for Neo4j Enterprise Edition. This means that unless users of
Neo4j Enterprise Edition specify a different runtime, queries will be run using the pipelined runtime.

To specify that a query should use the pipelined runtime, prepend the query with CYPHER runtime =
pipelined. For example:

Query

EXPLAIN
CYPHER runtime = pipelined
MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
((:Stop)-[:NEXT]->(:Stop))+
(a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
RETURN count(*)

The resulting execution plan contains notable differences from the one produced by slotted runtime:

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-------------------+----+------------------------------------------------------------------------
+----------------+---------------------+
| Operator | Id | Details |
Estimated Rows | Pipeline |
+-------------------+----+------------------------------------------------------------------------
+----------------+---------------------+
| +ProduceResults | 0 | `count(*)` |
1 | In Pipeline 3 |
| | +----+------------------------------------------------------------------------
+----------------+---------------------+
| +EagerAggregation | 1 | count(*) AS `count(*)` |
1 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Filter | 2 | not anon_1 = anon_5 AND anon_0.name = $autostring_0 AND anon_0:Station |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Expand(All) | 3 | (d)-[anon_1:CALLS_AT]->(anon_0) |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Filter | 4 | d:Stop |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +NullifyMetadata | 14 | |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Repeat(Trail) | 5 | (a) (...){1, *} (d) |
0 | Fused in Pipeline 2 |
| |\ +----+------------------------------------------------------------------------
+----------------+---------------------+
| | +Filter | 6 | isRepeatTrailUnique(anon_7) AND anon_2:Stop |
6 | |
| | | +----+------------------------------------------------------------------------
+----------------+ |
| | +Expand(All) | 7 | (anon_4)<-[anon_7:NEXT]-(anon_2) |
6 | |
| | | +----+------------------------------------------------------------------------
+----------------+ |
| | +Filter | 8 | anon_4:Stop |
11 | |
| | | +----+------------------------------------------------------------------------
+----------------+ |
| | +Argument | 9 | anon_4 |
13 | Fused in Pipeline 1 |
| | +----+------------------------------------------------------------------------
+----------------+---------------------+
| +Filter | 10 | a:Stop |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Expand(All) | 11 | (anon_6)<-[anon_5:CALLS_AT]-(a) |
0 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +Filter | 12 | anon_6.name = $autostring_1 |
1 | |
| | +----+------------------------------------------------------------------------
+----------------+ |
| +NodeByLabelScan | 13 | anon_6:Station |
10 | Fused in Pipeline 0 |
+-------------------+----+------------------------------------------------------------------------
+----------------+---------------------+

The rightmost column of the plan shows that it has been divided into four different pipelines. In order to
understand what pipelines are, it is first necessary to understand that queries using pipelined runtime are,
unlike those run in slotted runtime, not executed one row at a time. Rather, the pipelined runtime allows
the physical operators to consume and produce batches of between roughly 100 and 1000 rows each
(referred to as morsels), which are written into buffers containing data and tasks for a pipeline. A pipeline
can, in turn, be defined as a sequence of operators which have been fused into one another so that they
may be executed together in the same task by the runtime.

The logical operators are thus not mapped to a corresponding physical operator when using the pipelined
runtime. Instead, the logical operator tree is transformed into an execution graph containing pipelines and
buffers:

In this execution graph, query execution starts at pipeline 0 which will eventually produce a morsel to be
written into the buffer of pipeline 1. Once there is data for pipeline 1 to process, it can begin executing
and in turn write data for the next pipeline to process, and so on. In this way, data is being pushed along
the execution graph.

Considerations

The pipelined runtime is a push-based execution model, where data is pushed from the leaf operator to its
parent operators. Unlike pull-based models (which the slotted runtime uses), data can be kept in local
variables when using push-based execution models, and this has several benefits; it enables direct use of
CPU registers, improves the use of CPU caches, and avoids the costly virtual function calls used in pull-
based models.

The pipelined runtime is ideal for transactional use cases, with a large number of queries running in parallel
on the system. This covers most usage scenarios, and for this reason, it is the default Neo4j runtime.

The pipelined runtime is a combined model, that can either use an interpreted or compiled runtime.
However, because it predominantly uses the latter, it is considered a compiled runtime. Unlike interpreted
runtimes, compiled runtimes have a code generation phase followed by an execution phase, and this
typically causes a longer query planning time, but a shorter execution time.

As stated above, there are rare scenarios in which users of Neo4j Enterprise Edition may benefit from not
using the pipelined runtime for their queries. However, for most queries, the pipelined runtime is a more
efficient runtime capable of handling all operators and queries.

Parallel runtime (Enterprise Edition, introduced in Neo4j 5.13)
Both the slotted and pipelined runtimes execute queries in a single thread assigned to one CPU core. It is
still possible to achieve parallelism (broadly defined as when two or more sets of operations can be
processed concurrently within a single database environment) when using these two runtimes by running
multiple queries in separate CPU threads concurrently (this is the typical scenario in OLTP (Online
Transaction Processing) use cases). Another alternative is to run multiple transactions concurrently within
the same query using CALL {…} IN CONCURRENT TRANSACTIONS.
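
For example, a batched write can run its inner updates in several concurrent transactions while the outer query remains single-threaded. A minimal sketch (the CSV file, batch size, and concurrency level are placeholders; the query must be run in an implicit transaction):

LOAD CSV FROM 'file:///people.csv' AS row
CALL {
  WITH row
  CREATE (:Person {name: row[0]})
} IN 2 CONCURRENT TRANSACTIONS OF 500 ROWS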

However, there are scenarios, principally when performing graph analytics, where it is beneficial for a
single query to use several cores to boost its performance. This can be achieved by using parallel runtime,
which is multi-threaded and allows queries to potentially utilize all available cores on the server running
Neo4j.

To specify that a query should use the parallel runtime, prepend it with CYPHER runtime = parallel. For
example:

Query

EXPLAIN
CYPHER runtime = parallel
MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
((:Stop)-[:NEXT]->(:Stop))+
(a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
RETURN count(*)

This is the resulting execution plan:

Planner COST

Runtime PARALLEL

Runtime version 5.25

Batch size 128

+-----------------------------+----
+------------------------------------------------------------------------+----------------
+---------------------+
| Operator | Id | Details
| Estimated Rows | Pipeline |
+-----------------------------+----
+------------------------------------------------------------------------+----------------
+---------------------+
| +ProduceResults | 0 | `count(*)`
| 1 | In Pipeline 6 |
| | +----
+------------------------------------------------------------------------+----------------
+---------------------+
| +EagerAggregation | 1 | count(*) AS `count(*)`
| 1 | |
| | +----
+------------------------------------------------------------------------+----------------+
|
| +Filter | 2 | NOT anon_1 = anon_5 AND anon_0.name = $autostring_0 AND
anon_0:Station | 0 | |
| | +----
+------------------------------------------------------------------------+----------------+
|
| +Expand(All) | 3 | (d)-[anon_1:CALLS_AT]->(anon_0)
| 0 | Fused in Pipeline 5 |
| | +----
+------------------------------------------------------------------------+----------------
+---------------------+
| +Filter | 4 | d:Stop

| 0 | |
| | +----
+------------------------------------------------------------------------+----------------+
|
| +NullifyMetadata | 14 |
| 0 | |
| | +----
+------------------------------------------------------------------------+----------------+
|
| +Repeat(Trail) | 5 | (a) (...){1, *} (d)
| 0 | Fused in Pipeline 4 |
| |\ +----
+------------------------------------------------------------------------+----------------
+---------------------+
| | +Filter | 6 | isRepeatTrailUnique(anon_8) AND anon_7:Stop
| 6 | |
| | | +----
+------------------------------------------------------------------------+----------------+
|
| | +Expand(All) | 7 | (anon_9)<-[anon_8:NEXT]-(anon_7)
| 6 | Fused in Pipeline 3 |
| | | +----
+------------------------------------------------------------------------+----------------
+---------------------+
| | +Filter | 8 | anon_9:Stop
| 11 | |
| | | +----
+------------------------------------------------------------------------+----------------+
|
| | +Argument | 9 | anon_9
| 13 | Fused in Pipeline 2 |
| | +----
+------------------------------------------------------------------------+----------------
+---------------------+
| +Filter | 10 | a:Stop
| 0 | |
| | +----
+------------------------------------------------------------------------+----------------+
|
| +Expand(All) | 11 | (anon_6)<-[anon_5:CALLS_AT]-(a)
| 0 | Fused in Pipeline 1 |
| | +----
+------------------------------------------------------------------------+----------------
+---------------------+
| +Filter | 12 | anon_6.name = $autostring_1
| 1 | |
| | +----
+------------------------------------------------------------------------+----------------+
|
| +PartitionedNodeByLabelScan | 13 | anon_6:Station
| 10 | Fused in Pipeline 0 |
+-----------------------------+----
+------------------------------------------------------------------------+----------------
+---------------------+

A key difference between the physical plans produced by the parallel runtime compared to those
generated by pipelined runtime is that, in general, more pipelines are produced when using the parallel
runtime (in this case, seven instead of the four produced by the same query being run on pipelined
runtime). This is because, when executing a query in the parallel runtime, it is more efficient to have more
tasks that can be run in parallel, whereas when running a single-threaded execution in the pipelined
runtime it is more efficient to fuse several pipelines together.

Another important difference is that the parallel runtime uses partitioned operators
(PartitionedNodeByLabelScan in this case). These operators first segment the retrieved data and then
operate on each segment in parallel.

The parallel runtime shares the same architecture as the pipelined runtime, meaning that it will transform
the logical plan into the same type of execution graph as described above. However, when using parallel
runtime, each pipeline task can be executed in a separate thread. Another similarity with pipelined runtime

is that queries run on the parallel runtime will begin by generating the first pipeline which eventually will
produce a morsel in the input buffer of the subsequent pipeline. But, whereas only one pipeline can
progress at a time when using the pipelined runtime, parallel runtime allows pipelines to concurrently
produce morsels. Therefore, as each task finishes, more and more input morsels will be made available for
the tasks which means that more and more workers can be utilized to execute the query.

To further explain how parallel runtime works, a set of new terms need to be defined:

• Worker: a thread that executes work units to evaluate incoming queries.

• Task: a unit of work. A task executes one pipeline on one input morsel and produces one output
morsel. If any condition prevents a task from completing, it can be rescheduled as a Continuation to
resume at a later time.

• Continuation: a task that did not finish execution and must be scheduled again.

• Scheduler: responsible for deciding which unit of work to process next. Scheduling is decentralized,
and each worker has its own scheduler instance.

Consider the execution graph below, based on the same example query:

The execution graph shows that execution starts at pipeline 0, which consists of the operator
PartitionedNodeByLabelScan and can be executed simultaneously on all available threads working on
different morsels of data. Once pipeline 0 has produced at least one full morsel of data, any thread can
then start executing pipeline 1, while other threads may continue to execute pipeline 0. More
specifically, once there is data from a pipeline, the scheduler can proceed to the next pipeline while
concurrently executing earlier pipelines. In this case, pipeline 5 ends with an aggregation (performed by
the EagerAggregation operator), which means that the last pipeline (6) cannot start until all preceding
pipelines are completely finished for all the preceding morsels of data.

Considerations

When to use the parallel runtime


In most situations where multiple CPU cores are available, long-running queries can be expected to run
significantly faster on the parallel runtime. While it is not possible to define the exact duration at which a
query would benefit from being run on the parallel runtime (as this depends on the data model, the query
structure, the load of the system, and the number of cores available), it can be assumed as a general rule
that any query that takes longer than approximately 500 milliseconds would be a good candidate.

This means that the parallel runtime is suitable for analytical, graph-global queries. These queries are
often not anchored to a particular start node and therefore process a large section of the graph in order to
gain valuable insights from it.

However, queries that start with anchoring a specific node may benefit from being run on the parallel
runtime, if either of the following is true:

• The anchored starting node is a densely connected node or super node.

• The query proceeds to expand from the anchored node to a large section of the graph.

There is, therefore, no fixed rule as to when a query should be run with the parallel runtime, but these
guidelines provide some useful information about the scenarios when users would very likely benefit from
trying to use it.

When not to use the parallel runtime


Unlike the pipelined runtime, which is designed to be an efficient default for planning most queries, the
use cases for the parallel runtime are more specific, and there are situations where it is not possible or
beneficial to use it. Most notably, the parallel runtime only supports read queries. It also does not support
procedures and functions that are not considered thread-safe (i.e. not safe to run from multiple threads).

Moreover, not all queries will run faster by using the parallel runtime. For example, a graph-local query that
starts with anchoring a node and proceeds to only match a small portion of the graph will probably not run
any faster on the parallel runtime (it may even run slower when executed with the parallel runtime,
because of its scheduling and the additional book-keeping required for executing a query on multiple
threads). As a general rule of thumb, the parallel runtime is probably not beneficial for queries which take
less than half a second to complete.

The parallel runtime may also perform worse than the pipelined runtime for queries that contain
subclauses where ORDER BY is used to order a property that is indexed. This is because the parallel
runtime is unable to take advantage of property indexes for ordering, and therefore must re-sort the
aggregated results on the selected properties before returning any results.

Finally, though individual queries may run faster when running the parallel runtime, the overall throughput
of the database may decrease as a result of running many concurrent queries.

The parallel runtime is accordingly not suitable for transactional processing queries with high throughput
workloads. It is, however, ideal for analytical use cases where the database runs relatively few, but
demanding read queries.

Overview

In general, the parallel runtime should be considered if the following conditions are met:

• Graph-global read-queries are constructed to target a large section of a graph.

• The speed of queries is important.

• The server has many CPUs and enough memory.

• There is a low concurrency workload on the database.

For more information about the parallel runtime, including more details about queries, procedures,
functions, configuration settings, and using the parallel runtime on Aura, see the Parallel runtime: reference
page.

Summary
The below table summarizes the most important distinctions between the three different runtimes
available in Cypher:

                               Slotted            Pipelined                  Parallel

Execution model                Pull               Push                       Push

Physical operator consumption  Row-by-row         Batched                    Batched

Processor threads              Single-threaded    Single-threaded            Multi-threaded

Runtime type                   Interpreted        Compiled or interpreted    Compiled or interpreted

Supported query type           Read and write     Read and write             Read only

Parallel runtime: reference


The parallel runtime behaves differently compared to the slotted and pipelined runtimes in several regards.
This page explains the relevant configuration settings for the parallel runtime, the scenarios in which it is
not supported, and the procedures and functions that are not considered thread-safe. It also includes
relevant information for Neo4j Aura users.

Readers not familiar with the parallel runtime are encouraged to read about the parallel runtime concepts
before reading this page.

Updating queries
The parallel runtime will throw an error if a query attempts to update the graph. For example, any query
run on the parallel runtime that uses the CREATE clause will throw an error.
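A minimal illustration (the Person label is arbitrary) of a query that would trigger this:

CYPHER runtime=parallel
CREATE (n:Person)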

Error message

The parallel runtime does not support updating queries. Please use another runtime.

For a full list of all available Cypher write clauses, see the Clauses overview page.

Transactions
It is not possible to use the parallel runtime if a change has been made to the state of a transaction.

For example, the following transaction (initiated in Cypher Shell) will be rolled back, because the CREATE
clause in the first step changes the state of the transaction before the parallel query is attempted.

Step 1: Initiate a new transaction and change its state by creating a node

:begin
CREATE (n:Person)

Step 2: Attempt to execute a Cypher query with the parallel runtime on the existing transaction

CYPHER runtime = parallel


RETURN 42

Error message

An error occurred while in an open transaction. The transaction will be rolled back and terminated. Error:
The parallel runtime is not supported if there are changes in the transaction state. Use another runtime.
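One way around this (a sketch using Cypher Shell commands) is to commit or roll back the open
transaction first, and only then run the query on the parallel runtime in a fresh transaction:

:rollback
CYPHER runtime = parallel
RETURN 42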

For more information about transactions in Neo4j, see the Operations Manual → Transaction management.

Configuration settings
The following setting can be configured to modify the behavior of the parallel runtime:

server.cypher.parallel.worker_limit

Description Number of threads to allocate to Cypher worker threads for the parallel runtime. If
set to a positive number, that number of workers will be started. If set to 0, one
worker will be started for every logical processor available to the Java Virtual
Machine.

If set to a negative number, that value is subtracted from the number of logical
processors available. For example, if Neo4j is running on a server with 16 available
processors, using server.cypher.parallel.worker_limit = -1 would then mean
that 15 threads are available for the parallel runtime.

Valid values Integer

Default value 0

Setting server.cypher.parallel.worker_limit to a negative number -n where n is greater than the total
number of cores will disable the parallel runtime.

For more information about configuration settings in Neo4j, see the Operations Manual → Configuration.

Aura
The parallel runtime is available on all non-free AuraDB instances, regardless of their size or CPU count.
Additionally, when a query is run with parallel runtime on an Aura instance, it can utilize up to the total
number of available CPUs.

The parallel runtime is disabled on AuraDB Free instances. Attempting to run a query with parallel runtime
on AuraDB Free will throw the following error message:

Error message

Parallel runtime has been disabled, please enable it or upgrade to a bigger Aura instance.

Users of AuraDB Professional, AuraDB Business Critical, and AuraDB Virtual Dedicated
Cloud select the size and the number of available CPUs when creating an instance.
More information about the various tiers of AuraDB can be found on the Neo4j Pricing
page.

Procedures and functions


Procedures and functions that read the database are supported by the parallel runtime. Apart from this,
there are two categories of procedures and functions to keep in mind when using the parallel runtime.

The first can be categorized as updating procedures. These are procedures that update the graph with
write queries, such as the Neo4j procedure db.createProperty. If such procedures are called in a query run
on the parallel runtime, the query will fail.

The second can be categorized as non-thread-safe procedures and functions. These are procedures and
functions which perform tasks that are not protected against multiple worker threads concurrently
interacting with the targeted data. This includes procedures and functions that perform either of the
following tasks:

• Execute a Cypher query (because that will start a new transaction, which is not supported by the
parallel runtime).

• Start a new transaction (because this is not supported by the parallel runtime).

Calling procedures which perform any of these tasks in a query run on the parallel runtime will not fail the
query. Instead the query will automatically run on the pipelined runtime.

Neo4j procedures

The following Neo4j procedures are not considered thread-safe and cannot be run on the parallel runtime.
Trying to call them in a query run on the parallel runtime will not fail the query. Instead the query will
automatically run on the pipelined runtime.

Non-thread-safe Neo4j procedures

Procedure

db.awaitIndex

db.awaitIndexes

db.checkpoint

db.info

db.labels

db.listLocks

db.ping

db.propertyKeys

db.prepareForReplanning

db.relationshipTypes

db.resampleIndex

db.resampleOutdatedIndexes

db.schema.nodeTypeProperties

db.schema.relTypeProperties

db.schema.visualization

dbms.checkConfigValue

dbms.listActiveLocks

dbms.listPools

dbms.scheduler.failedJobs

dbms.scheduler.groups

dbms.scheduler.jobs

dbms.upgrade

dbms.upgradeStatus

APOC

The APOC library contains procedures and functions which extend the use of Cypher. There are a number
of APOC procedures and functions that are not considered thread-safe, and cannot be run on the parallel
runtime. For information about these, refer to the pages of the individual procedures and functions in the
APOC Manual.

User-defined functions

User-defined functions are simpler forms of procedures that return a single value and are read-only. To
learn more about user-defined functions in Neo4j, see the Java Reference Manual → User-defined
functions.

Similar to Neo4j and APOC procedures, any user-defined function that starts a new transaction by
executing a Cypher query is not considered thread-safe and will not be supported by the parallel runtime
(this includes all user-defined aggregating functions).

For example, consider the two following user-defined functions:

import org.neo4j.graphdb.Transaction;
import org.neo4j.procedure.Context;
import org.neo4j.procedure.UserFunction;

public class MyFunctions {
    @Context
    public Transaction transaction;

    @UserFunction("examples.return42")
    public long return42() {
        // Pure computation, no new transaction: safe on the parallel runtime.
        return 42L;
    }

    @UserFunction("examples.return42ViaCypher")
    public long return42ViaCypher() {
        // Executing a Cypher query starts a new transaction, so this is not thread-safe.
        return (long) transaction.execute("RETURN 42 AS res").next().get("res");
    }
}

Running examples.return42() will succeed with the parallel runtime, whereas


examples.return42ViaCypher() will fail because executing a new Cypher query will start a new
transaction.

However, if @NotThreadSafe is added to the method, then the query will automatically not run on the
parallel runtime. The query will instead default to the single-threaded pipelined runtime and generate a
notification.

Calling the below user-defined function would, therefore, not fail with the parallel runtime. Instead, the
Cypher query would automatically be run on the pipelined runtime.

import org.neo4j.graphdb.Transaction;
import org.neo4j.procedure.Context;
import org.neo4j.procedure.NotThreadSafe;
import org.neo4j.procedure.UserFunction;

public class MyFunctions {
    @Context
    public Transaction transaction;

    @UserFunction("examples.return42ViaCypher")
    @NotThreadSafe
    public long return42ViaCypher() {
        // Marked @NotThreadSafe: a calling query falls back to the pipelined runtime.
        return (long) transaction.execute("RETURN 42 AS res").next().get("res");
    }
}

Query tuning
Neo4j aims to execute queries as fast as possible. However, when optimizing for maximum query
execution performance, it may be helpful to rephrase queries using knowledge about the domain and the
application.

This page contains information about how to tune queries using different strategies.

For information about changing the runtime of a query, see the page about Cypher runtime concepts.

General recommendations
The overall goal of manual query performance optimization is to ensure that only necessary data is
retrieved from the graph.

Queries should aim to filter data as early as possible in order to reduce the amount of work that has to be
done in the later stages of query execution. This also applies to what gets returned: returning whole nodes
and relationships ought to be avoided in favour of selecting and returning only the data that is needed. You
should also make sure to set an upper limit on variable length patterns, so they don’t cover larger portions
of the dataset than needed.
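As an illustrative sketch (the label, relationship type, and properties are hypothetical), the second query
below follows these recommendations by anchoring on a parameter, bounding the variable-length pattern,
and returning only the needed property, instead of returning whole nodes from an unbounded expansion:

// Avoid: unbounded expansion, whole nodes returned
MATCH (p:Person)-[:KNOWS*]->(friend)
RETURN friend

// Prefer: filter early, bound the pattern, return only what is needed
MATCH (p:Person {name: $name})-[:KNOWS*1..3]->(friend)
RETURN friend.name AS friendName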

Each Cypher query gets optimized and transformed into an execution plan by the Cypher query planner.
To minimize the resources used for this, use parameters instead of literals when possible. This allows
Cypher to re-use cached execution plans instead of having to parse the query and build a new plan each time.

To read more about the execution plan operators mentioned in this section, see Operators.

Query options
Query execution can be fine-tuned through the use of query options.

In order to use one or more of these options, the query must be prepended with CYPHER, followed by the
query option(s), using the following syntax:

CYPHER query-option [further-query-options] query
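For example, a runtime option and a planner option can be combined in one prefix (a sketch; whether
these particular options help depends on the query and the data):

CYPHER runtime=pipelined planner=cost
MATCH (n:Person)
RETURN n.name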

For information about the various runtimes available in Cypher, see Cypher runtimes.

Cypher planner
The Cypher planner takes a Cypher query and computes an execution plan that solves it. For any given
query there is likely a number of execution plan candidates that each solve the query in a different way.
The planner uses a search algorithm to find the execution plan with the lowest estimated execution cost.

This table describes the available planner options:

Query option Description

planner=cost (default) Use cost-based planning with default limits on plan search space and time.

planner=idp Synonym for planner=cost.

planner=dp Use cost-based planning without limits on plan search space and time to perform an
exhaustive search for the best execution plan. Note: using this option can significantly increase the
planning time of the query.

Cypher connect-components planner Label—deprecated
One part of the Cypher planner is responsible for combining sub-plans for separate patterns into larger
plans - a task referred to as connecting components.

This table describes the available query options for the connect-components planner:

Query option Description

connectComponentsPlanner=greedy Use a greedy approach when combining sub-plans. Note: using this
option can significantly reduce the planning time of the query.

connectComponentsPlanner=idp (default) Use the cost-based IDP search algorithm when combining
sub-plans. Note: using this option can significantly increase the planning time of the query, but usually
finds better plans.

The Cypher query option connectComponentsPlanner is deprecated and will be removed without a
replacement. The product's default behavior of using a cost-based IDP search algorithm when combining
sub-plans will be kept.

Cypher update strategy


This option affects the eagerness of updating queries.

The possible values are:

Query option Description

updateStrategy=default (default) Update queries are executed eagerly when needed.

updateStrategy=eager Update queries are always executed eagerly.

Cypher expression engine


This option affects how the runtime evaluates expressions.

The possible values are:

Query option Description

expressionEngine=default (default) Compile expressions and use the compiled expression engine when
needed.

expressionEngine=interpreted Always use the interpreted expression engine.

expressionEngine=compiled Always compile expressions and use the compiled expression engine.

Cypher operator engine


This query option affects whether the pipelined runtime attempts to generate compiled code for groups of
operators.

The possible values are:

Query option Description

operatorEngine=default (default) Attempt to generate compiled operators when applicable.

operatorEngine=interpreted Never attempt to generate compiled operators.

operatorEngine=compiled Always attempt to generate compiled operators. Cannot be used together with
runtime=slotted.

Cypher interpreted pipes fallback


This query option affects how the pipelined runtime behaves for operators it does not directly support.

The available options are:

Query option Description

interpretedPipesFallback=default (default) Equivalent to
interpretedPipesFallback=whitelisted_plans_only.

interpretedPipesFallback=disabled If the plan contains any operators not supported by the pipelined
runtime, then another runtime is chosen to execute the entire plan. Cannot be used together with
runtime=slotted.

interpretedPipesFallback=whitelisted_plans_only Parts of the execution plan can be executed on another
runtime. Only certain operators are allowed to execute on another runtime. Cannot be used together with
runtime=slotted.

interpretedPipesFallback=all Parts of the execution plan may be executed on another runtime. Any
operator is allowed to execute on another runtime. Queries with this option set might produce incorrect
results, or fail. Cannot be used together with runtime=slotted. This setting is experimental, and using it in
a production environment is discouraged.

Cypher replanning
Cypher replanning occurs in the following circumstances:

• When the query is not in the cache. This can happen when the server is first started or restarted, if
the cache has recently been cleared, or if server.db.query_cache_size was exceeded.

• When the database statistics have diverged, since the plan was generated, by more than the
dbms.cypher.statistics_divergence_threshold value.

There may be situations where Cypher query planning can occur at a non-ideal time. For example, when a
query must be as fast as possible and a valid plan is already in place.

Replanning is not performed for all queries at once; it is performed in the same thread as
running the query, and can block the query. However, replanning one query does not
replan any other queries.

There are three different replan options available:

Option Description

replan=default (default) This is the planning and replanning option as described above.

replan=force This will force a replan, even if the plan is valid according to the planning rules. Once the
new plan is complete, it replaces the existing one in the query cache.

replan=skip If a valid plan already exists, it will be used even if the planning rules would normally dictate
that it should be replanned.

The replan option is prepended to queries.

For example:

CYPHER replan=force MATCH ...

In a mixed workload, you can force replanning by using the Cypher EXPLAIN commands. This can be useful
to schedule replanning of queries which are expensive to plan, at known times of low load. Using EXPLAIN
will make sure the query is only planned, but not executed.

For example:

CYPHER replan=force EXPLAIN MATCH ...

During times of known high load, replan=skip can be useful to not introduce unwanted latency spikes.

Cypher infer schema parts Label—new 5.21


For some queries, the planner can infer predicates such as labels or types from the graph structure,
thereby enhancing its ability to estimate the number of rows each operator will produce. (See
Understanding execution plans - Reading execution plans for more information about the role of operators
and estimated row counts in query execution plans.) The option inferSchemaParts controls the extent to
which the planner should infer predicates.

Option Description

inferSchemaParts=off No predicates are inferred.

inferSchemaParts=most_selective_label Relationship types are used to infer labels on connected nodes.
The label corresponding to the smallest number of nodes is used to estimate rows. Avoiding the inference
of multiple labels improves accuracy for nodes with several dependent labels, such as every :Actor being
a :Person.

If this query option is not provided, then the value set in Operations Manual → Configuration settings →
dbms.cypher.infer_schema_parts will be used.
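As a sketch (the relationship type is illustrative), the option can be prepended to a query in the usual way,
for example to compare the planner's row estimates with and without schema inference:

CYPHER inferSchemaParts=off
EXPLAIN
MATCH (a)-[:ACTED_IN]->(m)
RETURN a, m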

[13] The relevant information about the current state of the database includes which indexes and constraints are available,
as well as various statistics maintained by the database. The Cypher planner uses this information to determine which
access patterns will produce the best execution plan.

[14] The format of the execution plans displayed in this section are those generated when using Cypher Shell. The execution
plans generated by Neo4j Browser use a different format.

[15] The statistical information maintained by Neo4j includes the following: the number of nodes having a certain label, the
number of relationships by type, selectivity per index, and the number of relationships by type, ending with or starting from a
node with a specific label.

[16] The classification of a runtime as interpreted or compiled is not entirely accurate. Most runtime implementations are not
fully interpreted or fully compiled but are rather a blend of the two styles. For example, when the slotted runtime is run in
Neo4j Enterprise Edition, code is generated for the expressions included in the query. Nevertheless, the slotted runtime is
considered interpreted, since that is the predominant method of implementation.

Query caches
Out of the box, the set of query caches is per database. That means that a new set of caches is initialized
for each new database.

The maximum number of entries per cache is configured using


server.memory.query_cache.per_db_cache_num_entries. It determines the cache size only when
server.memory.query_cache.sharing_enabled is set to false.

Query caches may consume a lot of memory, especially when running many active databases. To tackle
this and improve predictability on memory consumption, you can configure the DBMS to use only one set
of caches for all databases. For more information, see Unifying query caches.

Configure caches
The following is a summary of the query cache configurations. For more information, see Operations
Manual → Configuration settings.

Query cache configurations

server.memory.query_cache.sharing_enabled (Enterprise Edition)
Enable sharing cache space between different databases. With this option turned on, databases will share
cache space, but not cache entries.
Default value: false

server.memory.query_cache.shared_cache_num_entries (Enterprise Edition)
The number of cached queries for all databases. This setting is only deciding cache size when
server.memory.query_cache.sharing_enabled is set to true.
Default value: 1000

server.memory.query_cache.per_db_cache_num_entries
The number of cached queries per database. This setting is only deciding cache size when
server.memory.query_cache.sharing_enabled is set to false.
Default value: 1000

Unifying query caches


To enable the unified query caches, set the option server.memory.query_cache.sharing_enabled=true.
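To verify the effective value of the setting (a sketch; SHOW SETTINGS requires the appropriate privileges),
you can run:

SHOW SETTINGS "server.memory.query_cache.sharing_enabled" YIELD name, value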

Unified query cache configurations

server.memory.query_cache.sharing_enabled (Enterprise Edition)
Enable sharing cache space between different databases. With this option turned on, databases will share
cache space, but not cache entries.
Default value: false

server.memory.query_cache.shared_cache_num_entries (Enterprise Edition)
The number of cached queries for all databases. This setting is only deciding cache size when
server.memory.query_cache.sharing_enabled is set to true.
Default value: 1000

When this feature is enabled, all databases use only one set of query caches. A database may store and
retrieve entries from the shared cache, but it may not retrieve entries produced by another database.

While databases use the same set of caches, a database may not observe entries
originating from other databases.

The database may, however, evict entries from other databases as necessary, according to the constrained
cache size and cache eviction policy. In essence, databases may compete for cache space, but may not
observe each other’s entries.

When this option is turned on, the cache space available to all databases is configured using the setting
server.memory.query_cache.shared_cache_num_entries.

Administration
The pages previously in this chapter have been moved to the Operations Manual.

More specific information about the content relocation is listed in the table:

Content New location in Operations Manual

Database management Database administration

Alias management Managing aliases

Server management Managing servers in a cluster

Access control Authentication and authorization

Syntax
Cypher follows several syntactic rules and recommendations that are important to know when
constructing queries. Further information can be found in the following sections:

• Parsing

• Naming rules and recommendations

• Variables

• Reserved keywords

• Parameters

• Operators

• Comments

Parsing
This page provides a general overview of how Cypher parses an input STRING.

The Cypher parser takes an arbitrary input STRING. While the syntax of Cypher is described in subsequent
chapters, the following details the general rules on which characters are considered valid input.

Using unicodes in Cypher


Unicode characters can generally be escaped as \uxxxx.

Additional documentation on escaping rules for STRING literals, names and regular expressions can be
found here:

• String literal escape sequences

• Using special characters in names

• Regular expressions

The following example escapes the unicode character A (\u0041) in the keyword MATCH:

M\u0041TCH (m) RETURN m;

The Unicode version used by Cypher depends on the running JVM version.

Neo4j version JVM compliancy Unicode version

3.x Java SE 8 Platform Specification Unicode 6.2

4.x Java SE 11 Platform Specification Unicode 10.0

5.x Java SE 17 Platform Specification Unicode 13.0

5.14 Java SE 17 and Java SE 21 Platform Specification Unicode 13.0 and Unicode 15.0

Supported whitespace
Whitespace can be used as a separator between keywords and has no semantic meaning. The following
unicode characters are considered as whitespace:

Description List of included Unicode characters

Unicode general category Zp \u2029

Unicode general category Zs \u0020 (space), \u1680, \u2000-200A, \u202F, \u205F, \u3000

Unicode general category class Zl \u2028

Horizontal tabulation \t, \u0009

Line feed \n, \u000A

Vertical tabulation \u000B

Form feed \f, \u000C

Carriage return \r, \u000D

File separator \u001C

Group separator \u001D

Record separator \u001E

Unit separator \u001F

Multiple consecutive whitespace characters have the same effect as a single whitespace character.

The following example query uses vertical tabulation (\u000B) as whitespace between the RETURN keyword
and the variable m:

MATCH (m) RETURN\u000Bm;

Supported newline characters


A newline character identifies a new line in the query and is also considered whitespace. The supported
newline characters in Cypher are:

Description List of included Unicode characters

Line feed \n, \u000A

Carriage return \r, \u000D

Carriage return + line feed \r\n, \u000D\u000A

Naming rules and recommendations
This page describes rules and recommendations for the naming of node labels, relationship types, property
names, variables, indexes, and constraints.

Naming rules
• Alphabetic characters:
◦ Names should begin with an alphabetic character.

◦ This includes "non-English" characters, such as å, ä, ö, ü etc.

• Numbers:
◦ Names should not begin with a number.

◦ To illustrate, 1first is not allowed, whereas first1 is allowed.

• Symbols:
◦ Names should not contain symbols, except for underscore, as in my_variable, or $ as the first
character to denote a parameter, as given by $myParam.

• Length:
◦ Can be very long, up to 65535 (2^16 - 1) or 65534 characters, depending on the version of Neo4j.

• Case-sensitive:
◦ Names are case-sensitive and thus, :PERSON, :Person and :person are three different labels, and n
and N are two different variables.

• Whitespace characters:
◦ Leading and trailing whitespace characters will be removed automatically. For example, MATCH ( a
) RETURN a is equivalent to MATCH (a) RETURN a.

Using special characters in names


Non-alphabetic characters, including numbers, symbols and whitespace characters, can be used in names,
but must be escaped using backticks. For example: `^n`, `1first`, `$$n`, and `my variable has spaces`.
Database names are an exception and may include dots without the need for escaping. For example:
naming a database foo.bar.baz is perfectly valid.
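For example (a sketch with made-up names), escaped names can appear anywhere an identifier is
expected:

CREATE (n:`Vehicle Owner` {`1first`: true, `my variable has spaces`: 'ok'})
RETURN n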

Within an escaped name, the following escaping sequences are allowed:

Escape sequence Character

`` Backtick

\uxxxx Unicode UTF-16 code point (4 hex digits must follow the \u)

Using escaped names with unsanitized user input makes you vulnerable to Cypher
injection. Some techniques to mitigate this are:

• sanitizing (and validating) the user input.

• remodeling your data model to avoid this data access pattern.

Several special characters have been deprecated and will require escaping in the next
major release of Neo4j. For the comprehensive list of deprecated characters, see the
deprecations page.

Scoping and namespace rules


• Node labels, relationship types and property names may re-use names.
◦ The following query — with a for the label, type and property name — is valid: CREATE (a:a {a:
'a'})-[r:a]->(b:a {a: 'a'}).

• Variables for nodes and relationships must not re-use names within the same query scope.
◦ The following query is not valid as the node and relationship both have the name a: CREATE (a)-
[a]->(b).

Recommendations
Here are the recommended naming conventions:

Node labels: camel case, beginning with an upper-case character, e.g. :VehicleOwner rather than
:vehicle_owner.

Relationship types: upper case, using underscore to separate words, e.g. :OWNS_VEHICLE rather than
:ownsVehicle.

Length limit of identifiers Label—new 5.25


Neo4j follows GQL's limit on the maximum length of identifiers.

The maximum limit is set to 16,383 characters in an identifier. This means that node labels, relationship
types, and property keys cannot include more than 16,383 characters.

Variables
This page provides an overview of variables in Cypher.

When you reference parts of a pattern or a query, you do so by naming them. The names you give the
different parts are called variables.

In this example:

MATCH (n)-->(b)
RETURN b

The variables are n and b.

Information regarding the naming of variables may be found here.

Variables are only visible in the same query part


Variables are not carried over to subsequent queries. If multiple query parts are chained
together using WITH, variables have to be listed in the WITH clause to be carried over to
the next part. For more information see WITH.

Reserved keywords
This page contains a list of reserved keywords in Cypher.

Reserved keywords are words that have a special meaning in Cypher. The listing of the reserved keywords
is grouped by the categories from which they are drawn. In addition to this, there are a number of
keywords that are reserved for future use.

The reserved keywords are not permitted to be used as identifiers in the following contexts:

• Variables

• Function names

• Parameters

If any reserved keyword is escaped — i.e. is encapsulated by backticks `, such as `AND` — it would become
a valid identifier in the above contexts.
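For example (a minimal sketch), escaping the reserved keyword MATCH allows it to be used as a variable
name:

MATCH (`MATCH`:Person)
RETURN `MATCH`.name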

Clauses
• CALL

• CREATE

• DELETE

• DETACH

• FOREACH

• LOAD

• MATCH

• MERGE

• OPTIONAL

• REMOVE

• RETURN

• SET

• START

• UNION

• UNWIND

• WITH

Subclauses
• LIMIT

• ORDER

• SKIP

• WHERE

• YIELD

Modifiers
• ASC

• ASCENDING

• ASSERT

• BY

• CSV

• DESC

• DESCENDING

• ON

Expressions
• ALL

• CASE

• COUNT

• ELSE

• END

• EXISTS

• THEN

• WHEN

Operators
• AND

• AS

• CONTAINS

• DISTINCT

• ENDS

• IN

• IS

• NOT

• OR

• STARTS

• XOR

Schema
• CONSTRAINT

• CREATE

• DROP

• EXISTS

• INDEX

• NODE

• KEY

• UNIQUE

Hints
• INDEX

• JOIN

• SCAN

• USING

Literals
• false

• null

• true

Reserved for future use


• ADD

• DO

• FOR

• MANDATORY

• OF

• REQUIRE

• SCALAR

Parameters
This page describes parameterized querying.

Introduction
Cypher supports querying with parameters. A parameterized query is a query in which placeholders are
used for parameters and the parameter values are supplied at execution time. This means developers do
not have to resort to string building to create a query. Additionally, parameters make caching of execution
plans much easier for Cypher, thus leading to faster query execution times.

Parameters can be used for:

• literals and expressions

• node and relationship ids

Parameters cannot be used for the following constructs, as these form part of the query structure that is
compiled into a query plan:

• property keys; so MATCH (n) WHERE n.$param = 'something' is invalid

• relationship types; so MATCH (n)-[:$param]->(m) is invalid

• labels; so MATCH (n:$param) is invalid

Parameters may consist of letters and numbers, and any combination of these, but cannot start with a
number or a currency symbol.

Setting parameters when running a query is dependent on the client environment. For example:

• To set a parameter in Cypher Shell use :param name => 'Joe'. For more information refer to
Operations Manual → Cypher Shell - Query Parameters.

• For Neo4j Browser use the same syntax as Cypher Shell, :param name => 'Joe'.

• When using drivers, the syntax is dependent on the language choice. See the examples in
Transactions in the Neo4j Driver manuals.

• For usage via the Neo4j HTTP API, see the HTTP API documentation.

We provide below a comprehensive list of examples of parameter usage. In these examples, parameters
are given in JSON; the exact manner in which they are to be submitted depends upon the driver being
used.

Auto-parameterization
From Neo4j 5 onwards, even when a query is only partially parameterized, Cypher will try to infer parameters
anyway. Each literal in the query is replaced with a parameter. This increases the re-usability of the
computed plan for queries that are identical except for the literals. It is not recommended to rely on this
behavior; instead, use explicit parameters where appropriate.

String literal
Parameters

{
"name": "Johan"
}

Query

MATCH (n:Person)
WHERE n.name = $name
RETURN n

You can use parameters in this syntax as well:

Parameters

{
"name": "Johan"
}

Query

MATCH (n:Person {name: $name})


RETURN n

Regular expression
Parameters

{
"regex": ".*h.*"
}

Query

MATCH (n:Person)
WHERE n.name =~ $regex
RETURN n.name

Case-sensitive STRING pattern matching

Parameters

{
"name": "Michael"
}

Query

MATCH (n:Person)
WHERE n.name STARTS WITH $name
RETURN n.name

Create node with properties


Parameters

{
"props": {
"name": "Andy",
"position": "Developer"
}
}

Query

CREATE ($props)

Create multiple nodes with properties


Parameters

{
"props": [ {
"awesome": true,
"name": "Andy",
"position": "Developer"
}, {
"children": 3,
"name": "Michael",
"position": "Developer"
} ]
}

Query

UNWIND $props AS properties


CREATE (n:Person)
SET n = properties
RETURN n

Setting all properties on a node


Note that this will replace all the current properties.

Parameters

{
"props": {
"name": "Andy",
"position": "Developer"
}
}

Query

MATCH (n:Person)
WHERE n.name = 'Michaela'
SET n = $props

SKIP and LIMIT


Parameters

{
"s": 1,
"l": 1
}

Query

MATCH (n:Person)
RETURN n.name
SKIP $s
LIMIT $l

Node id
Parameters

{
"id" : "4:1fd57deb-355d-47bb-a80a-d39ac2d2bcdb:0"
}

Query

MATCH (n)
WHERE elementId(n) = $id
RETURN n.name

Multiple node ids


Parameters

{
"ids" : [ "4:1fd57deb-355d-47bb-a80a-d39ac2d2bcdb:0", "4:1fd57deb-355d-47bb-a80a-d39ac2d2bcdb:1" ]
}

Query

MATCH (n)
WHERE elementId(n) IN $ids
RETURN n.name

Calling procedures
Parameters

{
"indexname" : "My_index"
}

Query

CALL db.resampleIndex($indexname)

Operators
This page contains an overview of the available Cypher operators.

Operators at a glance
Aggregation operators: DISTINCT

Property operators: . for static property access, [] for dynamic property access, = for replacing all
properties, += for mutating specific properties

Mathematical operators: +, -, *, /, %, ^

Comparison operators: =, <>, <, >, <=, >=, IS NULL, IS NOT NULL

STRING-specific comparison operators: STARTS WITH, ENDS WITH, CONTAINS, =~ (regex matching)

Boolean operators: AND, OR, XOR, NOT

String operators: + and || (string concatenation), IS NORMALIZED

Temporal operators: + and - for operations between durations and temporal instants/durations, * and /
for operations between durations and numbers

Map operators: . for static value access by key, [] for dynamic value access by key

List operators: + and || (list concatenation), IN to check existence of an element in a list, [] for
accessing element(s) dynamically

Aggregation operators
The aggregation operators comprise:

• remove duplicate values: DISTINCT

Using the DISTINCT operator
Retrieve the unique eye colors from Person nodes.

Query

CREATE
(a:Person {name: 'Anne', eyeColor: 'blue'}),
(b:Person {name: 'Bill', eyeColor: 'brown'}),
(c:Person {name: 'Carol', eyeColor: 'blue'})
WITH [a, b, c] AS ps
UNWIND ps AS p
RETURN DISTINCT p.eyeColor

Even though both 'Anne' and 'Carol' have blue eyes, 'blue' is only returned once.

Result

p.eyeColor

"blue"

"brown"

Rows: 2
Nodes created: 3
Properties set: 6
Labels added: 3

DISTINCT is commonly used in conjunction with aggregating functions.
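For example (a sketch reusing the Person nodes created above), DISTINCT inside an aggregating function
counts each eye color only once, which would return 2 for the data above:

MATCH (p:Person)
RETURN count(DISTINCT p.eyeColor) AS eyeColors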

Property operators
The property operators pertain to a node or a relationship, and comprise:

• statically access the property of a node or relationship using the dot operator: .

• dynamically access the property of a node or relationship using the subscript operator: []

• property replacement = for replacing all properties of a node or relationship

• property mutation operator += for setting specific properties of a node or relationship

Statically accessing a property of a node or relationship using the . operator


Query

CREATE
(a:Person {name: 'Jane', livesIn: 'London'}),
(b:Person {name: 'Tom', livesIn: 'Copenhagen'})
WITH a, b
MATCH (p:Person)
RETURN p.name

Result

p.name

"Jane"


"Tom"

Rows: 2
Nodes created: 2
Properties set: 4
Labels added: 2

Filtering on a dynamically-computed property key using the [] operator


Query

CREATE
(a:Restaurant {name: 'Hungry Jo', rating_hygiene: 10, rating_food: 7}),
(b:Restaurant {name: 'Buttercup Tea Rooms', rating_hygiene: 5, rating_food: 6}),
(c1:Category {name: 'hygiene'}),
(c2:Category {name: 'food'})
WITH a, b, c1, c2
MATCH (restaurant:Restaurant), (category:Category)
WHERE restaurant["rating_" + category.name] > 6
RETURN DISTINCT restaurant.name

Result

restaurant.name

"Hungry Jo"

Rows: 1
Nodes created: 4
Properties set: 8
Labels added: 4

See Basic usage for more details on dynamic property access.

The behavior of the [] operator with respect to null is detailed here.

Replacing all properties of a node or relationship using the = operator


Query

CREATE (a:Person {name: 'Sofia', age: 20})


WITH a
MATCH (p:Person {name: 'Sofia'})
SET p = {name: 'Ellen', livesIn: 'London'}
RETURN p.name, p.age, p.livesIn

All the existing properties on the node are replaced by those provided in the map; i.e. the name property is
updated from Sofia to Ellen, the age property is deleted, and the livesIn property is added.

Result

p.name p.age p.livesIn

"Ellen" <null> "London"


Rows: 1
Nodes created: 1
Properties set: 5
Labels added: 1

See Replace all properties using a map and = for more details on using the property replacement operator
=.

Mutating specific properties of a node or relationship using the += operator


Query

CREATE (a:Person {name: 'Sofia', age: 20})


WITH a
MATCH (p:Person {name: 'Sofia'})
SET p += {name: 'Ellen', livesIn: 'London'}
RETURN p.name, p.age, p.livesIn

The properties on the node are updated as follows by those provided in the map: the name property is
updated from Sofia to Ellen, the age property is left untouched, and the livesIn property is added.

Result

p.name p.age p.livesIn

"Ellen" 20 "London"

Rows: 1
Nodes created: 1
Properties set: 4
Labels added: 1

See Mutate specific properties using a map and += for more details on using the property mutation
operator +=.

Mathematical operators
The mathematical operators comprise:

• addition: +

• subtraction or unary minus: -

• multiplication: *

• division: /

• modulo division: %

• exponentiation: ^

Using the exponentiation operator ^

Query

WITH 2 AS number, 3 AS exponent


RETURN number ^ exponent AS result

Result

result

8.0

Rows: 1

Using the unary minus operator -


Query

WITH -3 AS a, 4 AS b
RETURN b - a AS result

Result

result

7

Rows: 1

Comparison operators
The comparison operators comprise:

• equality: =

• inequality: <>

• less than: <

• greater than: >

• less than or equal to: <=

• greater than or equal to: >=

• IS NULL

• IS NOT NULL

STRING-specific comparison operators comprise:


• STARTS WITH: perform case-sensitive prefix searching on STRING values.

• ENDS WITH: perform case-sensitive suffix searching on STRING values.

• CONTAINS: perform case-sensitive inclusion searching in STRING values.

• =~: regular expression for matching a pattern.

Comparing two numbers
Query

WITH 4 AS one, 3 AS two


RETURN one > two AS result

Result

result

true

Rows: 1

See Equality and comparison of values for more details on the behavior of comparison operators, and
Using ranges for more examples showing how these may be used.

Using STARTS WITH to filter names


Query

WITH ['John', 'Mark', 'Jonathan', 'Bill'] AS somenames


UNWIND somenames AS names
WITH names AS candidate
WHERE candidate STARTS WITH 'Jo'
RETURN candidate

Result

candidate

"John"

"Jonathan"

Rows: 2

STRING matching contains more information regarding the STRING-specific comparison operators as well as
additional examples illustrating the usage thereof.

Equality and comparison of values

Equality

Cypher supports comparing values (see Property, structural, and constructed values) by equality using the
= and <> operators.

Values of the same type are only equal if they are the same identical value (e.g. 3 = 3 and "x" <> "xy").

Maps are only equal if they map exactly the same keys to equal values and lists are only equal if they
contain the same sequence of equal values (e.g. [3, 4] = [1+2, 8/2]).

Values of different types are considered as equal according to the following rules:

• Paths are treated as lists of alternating nodes and relationships and are equal to all lists that contain
that very same sequence of nodes and relationships.

• Testing any value against null with both the = and the <> operators always evaluates to null. This
includes null = null and null <> null. The only way to reliably test if a value v is null is by using the
special v IS NULL, or v IS NOT NULL, equality operators. v IS NOT NULL is equivalent to NOT(v IS
NULL).

All other combinations of types of values cannot be compared with each other. Especially, nodes,
relationships, and literal maps are incomparable with each other.

It is an error to compare values that cannot be compared.

Ordering and comparison of values


The comparison operators <=, < (for ascending) and >=, > (for descending) are used to compare values for
ordering. The following points give some details on how the comparison is performed.

• Numerical values are compared for ordering using numerical order (e.g. 3 < 4 is true).

• All comparability tests (<, <=, >, >=) with java.lang.Double.NaN evaluate as false. For example, 1 > b
and 1 < b are both false when b is NaN.

• String values are compared for ordering using lexicographic order (e.g. "x" < "xy").

• Boolean values are compared for ordering such that false < true.

• Spatial values cannot be compared using the operators <, <=, >, or >=. To compare spatial values within
a specific range, use either the point.withinBBox() or the point() function.

• Ordering of spatial values:


◦ ORDER BY requires all values to be orderable.

◦ Points are ordered after arrays and before temporal types.

◦ Points of different CRS are ordered by the CRS code (the value of the SRID field). For the currently
supported set of Coordinate Reference Systems this means the order: 4326, 4979, 7203, 9157.
◦ Points of the same CRS are ordered by each coordinate value in turn, x first, then y and finally z.

◦ Note that this order is different to the order returned by the spatial index, which will be the order of
the space filling curve.

• Comparison of temporal values:


◦ Temporal instant values are comparable within the same type. An instant is considered less than
another instant if it occurs before that instant in time, and it is considered greater than if it occurs
after.
◦ Instant values that occur at the same point in time — but that have a different time zone — are not
considered equal, and must therefore be ordered in some predictable way. Cypher prescribes that,
after the primary order of point in time, instant values be ordered by effective time zone offset,
from west (negative offset from UTC) to east (positive offset from UTC). This has the effect that
times that represent the same point in time will be ordered with the time with the earliest local time
first. If two instant values represent the same point in time, and have the same time zone offset,
but a different named time zone (this is possible for DateTime only, since Time only has an offset),
these values are not considered equal, and ordered by the time zone identifier, alphabetically, as its
third ordering component. If the type, point in time, offset, and time zone name are all equal, then
the values are equal, and any difference in order is impossible to observe.
◦ Duration values cannot be compared, since the length of a day, month or year is not known
without knowing which day, month or year it is. Since Duration values are not comparable, the
result of applying a comparison operator between two Duration values is null.

• Ordering of temporal values:


◦ ORDER BY requires all values to be orderable.

◦ Temporal instances are ordered after spatial instances and before strings.

◦ Comparable values should be ordered in the same order as implied by their comparison order.

◦ Temporal instant values are first ordered by type, and then by comparison order within the type.

◦ Since no complete comparison order can be defined for Duration values, we define an order for
ORDER BY specifically for Duration:
▪ Duration values are ordered by normalising all components as if all years were 365.2425 days
long (PT8765H49M12S), all months were 30.436875 (1/12 year) days long (PT730H29M06S), and all
days were 24 hours long.[17]

• Comparing for ordering when one argument is null evaluates to null (e.g. null < 3 is null).

• Ordering of values with different types:


◦ The ordering is, in ascending order, defined according to the following list:

▪ MAP

▪ NODE

▪ RELATIONSHIP

▪ LIST

▪ PATH

▪ ZONED DATETIME

▪ LOCAL DATETIME

▪ DATE

▪ ZONED TIME

▪ LOCAL TIME

▪ DURATION

▪ STRING

▪ BOOLEAN

▪ Numbers: INTEGER, FLOAT

◦ The value null is ordered after all other values.

• Ordering of constructed type values:


◦ For the constructed types (e.g. maps and lists), elements of the containers are compared pairwise
for ordering and thus determine the ordering of two container types. For example, [1, 'foo', 3] is
ordered before [1, 2, 'bar'] since 'foo' is ordered before 2.
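A minimal sketch illustrating this ordering with ORDER BY (the two lists are the ones discussed above;
the list [1, 'foo', 3] is returned first):

UNWIND [[1, 2, 'bar'], [1, 'foo', 3]] AS l
RETURN l
ORDER BY l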

Chaining comparison operations


Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y AND y <= z.

Formally, if a, b, c, ..., y, z are expressions and op1, op2, ..., opN are comparison operators, then a
op1 b op2 c ... y opN z is equivalent to a op1 b and b op2 c and ... y opN z.

Note that a op1 b op2 c does not imply any kind of comparison between a and c, so that, e.g., x < y > z
is perfectly legal (although perhaps not elegant).

The example:

MATCH (n) WHERE 21 < n.age <= 30 RETURN n

is equivalent to

MATCH (n) WHERE 21 < n.age AND n.age <= 30 RETURN n

Thus, it matches all nodes where the age is between 21 and 30.

This syntax extends to all equality = and inequality <> comparisons, as well as to chains longer than three.

Chains of = and <> are treated in a special way in Cypher.
This means that 1=1=true is equivalent to 1=1 AND 1=true and not to (1=1)=true or
1=(1=true).

For example:

a < b = c <= d <> e

Is equivalent to:

a < b AND b = c AND c <= d AND d <> e

Using a regular expression with =~ to filter words


Query

WITH ['mouse', 'chair', 'door', 'house'] AS wordlist


UNWIND wordlist AS word
WITH word
WHERE word =~ '.*ous.*'
RETURN word

Result

word

"mouse"

"house"

Rows: 2

Further information and examples regarding the use of regular expressions in filtering can be found in
Regular expressions.

Boolean operators
The boolean operators — also known as logical operators — comprise:

• conjunction: AND

• disjunction: OR,

• exclusive disjunction: XOR

• negation: NOT

Here is the truth table for AND, OR, XOR and NOT.

a b a AND b a OR b a XOR b NOT a

false false false false false true

false null false null null true

false true false true true true

true false false true true false

true null null true null false

true true true true false false

null false false null null null

null null null null null null

null true null true null null

Using boolean operators to filter numbers


Query

WITH [2, 4, 7, 9, 12] AS numberlist


UNWIND numberlist AS number
WITH number
WHERE number = 4 OR (number > 6 AND number < 10)
RETURN number

Result

number

4

7

9

Rows: 3

String operators
The string operators comprise:

• concatenating STRING values: + and ||

• checking if a STRING is normalized: IS NORMALIZED

Concatenating two STRING values with +


Using + to concatenate strings is functionally equivalent to using ||. However, the + string
concatenation operator is not GQL conformant.

Query

RETURN 'neo' + '4j' AS result

Result

result

"neo4j"

Rows: 1

Concatenating two STRING values with || Label—new 5.19


Query

RETURN 'neo' || '4j' AS result

Result

result

"neo4j"

Rows: 1

Checking if a STRING IS NORMALIZED Label—new 5.17


The IS NORMALIZED operator is used to check whether the given STRING is in the NFC Unicode normalization
form:

Unicode normalization is a process that transforms different representations of the same
string into a standardized form. For more information, see the documentation for
Unicode normalization forms.

Query

RETURN "the \u212B char" IS NORMALIZED AS normalized

Result

normalized

false

Because the given STRING contains a non-normalized Unicode character (\u212B), false is returned.

To normalize a STRING, use the normalize() function.

Note that the IS NORMALIZED operator returns null when used on a non-STRING value. For example, RETURN
1 IS NORMALIZED returns null.
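As a sketch, normalizing the value first with the normalize() function (which defaults to the NFC form)
should make the check return true:

RETURN normalize("the \u212B char") IS NORMALIZED AS normalizedAfter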

Checking if a STRING IS NOT NORMALIZED Label—new 5.17


The IS NOT NORMALIZED operator is used to check whether the given STRING is not in the NFC Unicode
normalization form:

Query

RETURN "the \u212B char" IS NOT NORMALIZED AS notNormalized

Result

notNormalized

true

Because the given STRING contains a non-normalized Unicode character (\u212B), and is not normalized,
true is returned.

To normalize a STRING, use the normalize() function.

Note that the IS NOT NORMALIZED operator returns null when used on a non-STRING value. For example,
RETURN 1 IS NOT NORMALIZED returns null.

Using IS NORMALIZED with a specified normalization type

It is possible to define which Unicode normalization type is used (the default is NFC).

The available normalization types are:

• NFC

• NFD

• NFKC

• NFKD

Query

WITH "the \u00E4 char" as myString


RETURN myString IS NFC NORMALIZED AS nfcNormalized,
myString IS NFD NORMALIZED AS nfdNormalized

The given STRING contains the Unicode character: \u00E4, which is considered normalized in NFC form, but
not in NFD form.

Result

nfcNormalized nfdNormalized

true false

Rows: 1

It is also possible to specify the normalization form when using the negated normalization operator. For
example, RETURN "string" IS NOT NFD NORMALIZED.

Temporal operators
Temporal operators comprise:

• adding a DURATION to either a temporal instant or another DURATION: +

• subtracting a DURATION from either a temporal instant or another DURATION: -

• multiplying a DURATION with a number: *

• dividing a DURATION by a number: /

The following table shows — for each combination of operation and operand type — the type of the value
returned from the application of each temporal operator:

Operator Left-hand operand Right-hand operand Type of result

+ Temporal instant DURATION The type of the temporal instant

+ DURATION Temporal instant The type of the temporal instant

- Temporal instant DURATION The type of the temporal instant

+ DURATION DURATION DURATION

- DURATION DURATION DURATION

* DURATION Number DURATION

* Number DURATION DURATION

/ DURATION Number DURATION

Adding and subtracting a DURATION to or from a temporal instant
Query

WITH
localdatetime({year:1984, month:10, day:11, hour:12, minute:31, second:14}) AS aDateTime,
duration({years: 12, nanoseconds: 2}) AS aDuration
RETURN aDateTime + aDuration, aDateTime - aDuration

Result

aDateTime + aDuration aDateTime - aDuration

1996-10-11T12:31:14.000000002 1972-10-11T12:31:13.999999998

Rows: 1

Components of a DURATION that do not apply to the temporal instant are ignored. For example, when
adding a DURATION to a DATE, the hours, minutes, seconds and nanoseconds of the DURATION are ignored
(ZONED TIME and LOCAL TIME behave in an analogous manner):

Query

WITH
date({year:1984, month:10, day:11}) AS aDate,
duration({years: 12, nanoseconds: 2}) AS aDuration
RETURN aDate + aDuration, aDate - aDuration

Result

aDate + aDuration aDate - aDuration

1996-10-11 1972-10-11

Rows: 1

Adding two durations to a temporal instant is not an associative operation. This is because non-existing
dates are truncated to the nearest existing date:

Query

RETURN
(date("2011-01-31") + duration("P1M")) + duration("P12M") AS date1,
date("2011-01-31") + (duration("P1M") + duration("P12M")) AS date2

Result

date1 date2

2012-02-28 2012-02-29

Rows: 1

Adding and subtracting a DURATION to or from another DURATION

Query

WITH
duration({years: 12, months: 5, days: 14, hours: 16, minutes: 12, seconds: 70, nanoseconds: 1}) as
duration1,
duration({months:1, days: -14, hours: 16, minutes: -12, seconds: 70}) AS duration2
RETURN duration1, duration2, duration1 + duration2, duration1 - duration2

Result

duration1 duration2 duration1 + duration2 duration1 - duration2

P12Y5M14DT16H13M10.000000001S P1M-14DT15H49M10S P12Y6MT32H2M20.000000001S P12Y4M28DT24M0.000000001S

Rows: 1

Multiplying and dividing a DURATION with or by a number


These operations are interpreted simply as component-wise operations with overflow to smaller units
based on an average length of units in the case of division (and multiplication with fractions).

Query

WITH duration({days: 14, minutes: 12, seconds: 70, nanoseconds: 1}) AS aDuration
RETURN aDuration, aDuration * 2, aDuration / 3

Result

aDuration aDuration * 2 aDuration / 3

P14DT13M10.000000001S P28DT26M20.000000002S P4DT16H4M23.333333333S

Rows: 1

Map operators
The map operators comprise:

• statically access the value of a map by key using the dot operator: .

• dynamically access the value of a map by key using the subscript operator: []

The behavior of the [] operator with respect to null is detailed in the working with null
page.

Statically accessing the value of a nested map by key using the . operator
Query

WITH {person: {name: 'Anne', age: 25}} AS p


RETURN p.person.name

Result

p.person.name

"Anne"

Rows: 1

Dynamically accessing the value of a map by key using the [] operator and a
parameter
A parameter may be used to specify the key of the value to access:

Parameters

{
"myKey" : "name"
}

Query

WITH {name: 'Anne', age: 25} AS a


RETURN a[$myKey] AS result

Result

result

"Anne"

Rows: 1

More information can be found in the Maps chapter.

List operators
The list operators comprise:

• concatenating lists l1 and l2: [l1] + [l2] and [l1] || [l2]

• checking if an element e exists in a list l: e IN [l]

• dynamically accessing an element(s) in a list using the subscript operator: []

The behavior of the IN and [] operators with respect to null is detailed here.

Concatenating two lists using +


Query

RETURN [1,2,3,4,5] + [6,7] AS myList

Result

myList

[1,2,3,4,5,6,7]

Rows: 1

Concatenating two lists using || Label—new 5.19


Query

RETURN [1,2,3,4,5] || [6,7] AS myList

Result

myList

[1,2,3,4,5,6,7]

Rows: 1

Using IN to check if a number is in a list


Query

WITH [2, 3, 4, 5] AS numberlist


UNWIND numberlist AS number
WITH number
WHERE number IN [2, 3, 8]
RETURN number

Result

number

2

3

Rows: 2

Using IN for more complex list membership operations


The general rule is that the IN operator will evaluate to true if the list given as the right-hand operand
contains an element which has the same type and contents (or value) as the left-hand operand. Lists are
only comparable to other lists, and elements of a list innerList are compared pairwise in ascending order
from the first element in innerList to the last element in innerList.

The following query checks whether or not the list [2, 1] is an element of the list [1, [2, 1], 3]:

Query

RETURN [2, 1] IN [1, [2, 1], 3] AS inList

The query evaluates to true as the right-hand list contains, as an element, the list [2, 1], which is of the
same type (a list) and contains the same contents (the numbers 2 and 1 in the given order) as the left-hand
operand. If the left-hand operand had been [1, 2] instead of [2, 1], the query would have returned
false.

Result

inList

true

Rows: 1

At first glance, the contents of the left-hand operand and the right-hand operand appear to be the same in
the following query:

Query

RETURN [1, 2] IN [1, 2] AS inList

However, IN evaluates to false as the right-hand operand does not contain an element that is of the same
type — i.e. a list — as the left-hand operand.

Result

inList

false

Rows: 1

The following query can be used to ascertain whether or not a list — obtained from, say, the labels()
function — contains at least one element that is also present in another list:

MATCH (n)
WHERE size([label IN labels(n) WHERE label IN ['Person', 'Employee'] | 1]) > 0
RETURN count(n)

As long as labels(n) returns either Person or Employee (or both), the query will return a value greater than
zero.
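
An equivalent membership check can also be written with the any() list predicate function (a brief alternative sketch):

MATCH (n)
WHERE any(label IN labels(n) WHERE label IN ['Person', 'Employee'])
RETURN count(n)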

Accessing elements in a list using the [] operator


Query

WITH ['Anne', 'John', 'Bill', 'Diane', 'Eve'] AS names


RETURN names[1..3] AS result

The square brackets will extract the elements from the start index 1, and up to (but excluding) the end
index 3.

Result

result

["John","Bill"]


Rows: 1
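
Negative indices count from the end of a list, so the same list can also be indexed or sliced from the right (a minimal sketch):

WITH ['Anne', 'John', 'Bill', 'Diane', 'Eve'] AS names
RETURN
  names[-1] AS lastName,   // 'Eve'
  names[-3..] AS lastThree // ['Bill', 'Diane', 'Eve']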

Dynamically accessing an element in a list using the [] operator and a parameter


A parameter may be used to specify the index of the element to access:

Parameters

{
"myIndex" : 1
}

Query

WITH ['Anne', 'John', 'Bill', 'Diane', 'Eve'] AS names


RETURN names[$myIndex] AS result

Result

result

"John"

Rows: 1

Using IN with [] on a nested list


IN can be used in conjunction with [] to test whether an element exists in a nested list:

Query

WITH [[1, 2, 3]] AS l


RETURN 3 IN l[0] AS result

Result

result

true

Rows: 1

More details on lists can be found in Lists in general.

Comments
This page describes how to use comments in Cypher.

A single-line comment begins with a double slash (//) and continues to the end of the line. A multi-line
comment begins with a slash and asterisk (/*) and continues until it ends with an asterisk and a slash (*/).
Comments are not executed; they are only there for humans to read.

Examples:

MATCH (n) RETURN n //This is an end of line comment

MATCH (n)
//This is a whole line comment
RETURN n

MATCH (n) /* This is a multi line comment,


the comment continues on this line
but it ends on this line. */
RETURN n

MATCH (n) WHERE n.property = '//This is NOT a comment' RETURN n

[17] The 365.2425 days per year comes from the frequency of leap years. A leap year occurs on a year with an ordinal
number divisible by 4 that is not divisible by 100, unless it is divisible by 400. This means that over 400 years there are
((365 * 4 + 1) * 25 - 1) * 4 + 1 = 146097 days, which gives an average of 365.2425 days per year.
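
As a quick sanity check, the same arithmetic can be reproduced directly in Cypher:

// 365*4+1 days per 4-year cycle, 25 cycles per century minus the skipped centennial leap day,
// 4 centuries plus the 400-year leap day = 146097 days per 400 years
RETURN
  ((365 * 4 + 1) * 25 - 1) * 4 + 1 AS daysPer400Years,           // 146097
  (((365 * 4 + 1) * 25 - 1) * 4 + 1) / 400.0 AS avgDaysPerYear   // 365.2425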

Deprecations, additions, and compatibility
Cypher is a language that is constantly evolving. New features are added to the language continuously,
and occasionally, some features become deprecated and are subsequently removed.

This section lists all of the features that have been removed, deprecated, added, or extended in different
Cypher versions. Replacement syntax for deprecated and removed features are also indicated.

Neo4j 5.25

Deprecated features
Feature Details

Functionality Deprecated The CREATE DATABASE option existingDataSeedInstance has


been deprecated and replaced with the option
existingDataSeedServer. The functionality is unchanged.
CREATE DATABASE db OPTIONS {
existingDataSeedInstance: ... }

Updated features
Feature Details

Functionality Updated Neo4j now applies GQL's limit on the maximum length of identifiers. The maximum limit is
set to 16,383 characters in an identifier. This means that node labels, relationship types, and property keys
cannot include more than 16,383 characters.

CREATE (n:Label {property: 'name'}),
()-[r:REL_TYPE]->()

New features
Feature Details

Functionality New The option existingDataSeedServer has been added to CREATE DATABASE. The functionality is
the same as the deprecated option existingDataSeedInstance, which it replaces.

CREATE DATABASE db OPTIONS {
existingDataSeedServer: ... }

Neo4j 5.24

New features

Feature Details

Functionality New Introduced OPTIONAL CALL for optionally executing a procedure or subquery CALL. Similar to
OPTIONAL MATCH, any empty rows produced by the OPTIONAL CALL will return null and not affect the remainder
of the procedure or subquery evaluation.

MATCH (t:Team)
OPTIONAL CALL (t) {
  MATCH (p:Player)-[:PLAYS_FOR]->(t)
  RETURN collect(p) as players
}
RETURN t AS team, players

OPTIONAL CALL db.labels() YIELD label
RETURN label

Functionality New Introduced OFFSET, a GQL conformant synonym to


SKIP.
MATCH (n)
RETURN n.name AS names OFFSET 2 See OFFSET as a synonym for SKIP for details.

Functionality New Introduced GQL conformant standalone ORDER BY,


SKIP/OFFSET, and LIMIT clauses.
MATCH (n)
ORDER BY n.name DESC
OFFSET 3
LIMIT 2
RETURN collect(n.name) AS names

Functionality New Added the ability to dynamically reference labels in SET and
REMOVE clauses.

SET n:$(label)
REMOVE n:$(label)

Functionality New Added the ability to dynamically reference properties in SET


and REMOVE clauses.

SET n[$prop] = "hello world"


REMOVE n[$prop]
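
For instance, with a parameter such as {"prop": "greeting"} (an illustrative sketch; the node label and property name are assumptions, not from the original):

MATCH (n:Person {name: 'Anne'})
SET n[$prop] = 'hello world'
RETURN n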

Functionality New Added the ability to drop database aliases while deleting a database. This will affect local
database aliases targeting the database and constituent database aliases belonging to the composite database.
For more information, see Delete a composite database with constituent database aliases.

DROP [COMPOSITE] DATABASE ... [RESTRICT | CASCADE ALIAS[ES]]


Functionality New Extension of the LOAD CSV clause to allow loading CSV files
from Azure Cloud Storage URIs.

LOAD CSV FROM 'azb://azb-account/azb-


container/artists.csv' AS row
MERGE (a:Artist {name: row[1], year: toInteger(
row[2])})
RETURN a.name, a.year

Functionality New Added the ability to set which auth providers apply to a user (Enterprise Edition).
Administration of the native (username / password) auth via the new syntax is also now supported
(Community Edition).

CREATE USER bob
SET AUTH 'externalProviderName' {
  SET ID 'userIdForExternalProvider'
}
SET AUTH 'native' {
  SET PASSWORD 'password'
  SET PASSWORD CHANGE REQUIRED
}

Functionality New Added the ability to add and remove user auth providers via the ALTER USER command.
Setting the native (username / password) auth provider via this new syntax is also supported
(Community Edition), but removing any auth provider or setting a non-native auth provider is only
supported in Enterprise Edition.

ALTER USER bob
REMOVE AUTH 'native'
SET AUTH 'externalProviderName' {
  SET ID 'userIdForExternalProvider'
}

Functionality New New support for WITH AUTH to allow displaying users' auth providers, with a separate row
per user per auth provider.

SHOW USERS WITH AUTH

Functionality New New privilege that allows a user to modify user auth providers. This is a sub-privilege of
the ALTER USER privilege. Like all GRANT/DENY commands, this is only available in Enterprise Edition.

SET AUTH

Neo4j 5.23

Deprecated features

Feature Details

Functionality Deprecated Using the WITH clause to import variables to CALL subqueries
is deprecated, and replaced with a variable scope clause. It is
also deprecated to use naked subqueries without a variable
UNWIND [0, 1, 2] AS x
CALL { scope clause.
WITH x
RETURN x * 10 AS y
}
RETURN x, y

Updated features
Feature Details

Functionality Updated Introduced new GQL conformant aliases to duration


types: TIMESTAMP WITHOUT TIME ZONE (alias to LOCAL
RETURN datetime.statement() IS :: TIMESTAMP WITH DATETIME), TIME WITHOUT TIME ZONE (alias to LOCAL
TIME ZONE
TIME), TIMESTAMP WITH TIME ZONE (alias to ZONED
DATETIME), and TIME WITH TIME ZONE (alias to ZONED
TIME).

See types and their synonyms for more.

New features
Feature Details

Functionality New Introduced a new variable scope clause to import variables in


CALL subqueries.

UNWIND [0, 1, 2] AS x
CALL (x) {
RETURN x * 10 AS y
}
RETURN x, y


Functionality New Introduced the following configuration settings for


vector indexes:
CREATE VECTOR INDEX moviePlots IF NOT EXISTS
FOR (m:Movie) • vector.quantization.enabled: allows for
ON m.embedding
OPTIONS {indexConfig: { enabling quantization, which can accelerate
`vector.quantization.enabled`: true search performance but can also slightly
`vector.hnsw.m`: 16,
`vector.hnsw.ef_construction`: 100, decrease accuracy.
}}
• vector.hnsw.m: controls the maximum number
of connections each node has in the index’s
internal graph.

• vector.hnsw.ef_construction: sets the number


of nearest neighbors tracked during the
insertion of vectors into the index’s internal
graph.

Additionally, as of Neo4j 5.23, it is no longer


mandatory to configure the settings
vector.dimensions and
vector.similarity_function when creating a
vector index.

Neo4j 5.21

Updated features
Feature Details

Functionality Updated Introduced a deprecatedBy column to SHOW


FUNCTIONS and SHOW PROCEDURES. It is not returned
SHOW FUNCTIONS YIELD * by default in either command.
SHOW PROCEDURES YIELD *
The column is a STRING value specifying a
replacement function/procedure if the used
function/procedure is deprecated. Otherwise, it
returns null.

New features

Feature Details

Functionality New Introduction of property-based access control for read


privileges. The ability to read, traverse and match nodes
based on node property values is now supported in
GRANT READ {*} ON GRAPH * FOR (n) WHERE
n.securityLevel > 3 TO regularUsers Enterprise Edition.

GRANT TRAVERSE ON GRAPH * FOR (n:Email) WHERE


n.classification IS NULL TO regularUsers

DENY MATCH {*} ON GRAPH * FOR (n) WHERE


n.classification <> 'UNCLASSIFIED' TO regularUsers

Functionality New Extension of the LOAD CSV clause to allow loading CSV files
from Google Cloud Storage URIs.

LOAD CSV FROM 'gs://gs-bucket/artists.csv' AS row


MERGE (a:Artist {name: row[1], year: toInteger(
row[2])})
RETURN a.name, a.year

Functionality New Introduction of inferSchemaParts, a new Cypher query option


that controls the extent to which the Cypher planner will infer
predicates.
CYPHER inferSchemaParts=most_selective_label

Functionality New Introduction of a lower() and upper() function. These are


aliases of the toLower() and toUpper() functions.

RETURN upper('abc'), lower('ABC')

Functionality New Introduced CALL { … } IN CONCURRENT TRANSACTIONS,


which uses multiple CPU processors simultaneously to
execute batched inner transactions concurrently.
UNWIND range(1, 10) as i
CALL {
WITH i
CREATE (n:N { i: i })
} IN 3 CONCURRENT TRANSACTIONS OF 2 ROWS


Functionality New Introduced new graph pattern matching keywords


to find variations of the shortest paths between
MATCH SHORTEST 1 (:A)-[:R]->{0,10}(:B) nodes.

MATCH p = ANY 2 (:A)-[:R]->{0,10}(:B)

MATCH ALL SHORTEST (:A)-[:R]->{0,10}(:B)

MATCH SHORTEST 2 GROUPS (:A)-[:R]->{0,10}(:B)

Functionality New Introduced new operators to solve SHORTEST


queries.
New operators:

• StatefulShortestPath(All)

• StatefulShortestPath(Into)

Neo4j 5.20

Deprecated features
Feature Details

Functionality Deprecated Merging a node or relationship entity, and then


referencing that entity in a property definition in the
MERGE (a {foo:1})-[:T]->(b {foo:a.foo}) same MERGE clause is deprecated. Split the MERGE
clause into two separate clauses instead.

New features
Feature Details

Syntax Functionality New Introduced btrim() function, which returns the given STRING
with leading and trailing trimCharacterString characters
removed. Also extended the trim(), ltrim(), and rtrim()
RETURN trim(BOTH 'x' FROM 'xxhelloxx'),
ltrim('xxhello', 'x'), functions to accept alternative trim character strings.
rtrim('helloxx', 'x'),
btrim('xxhelloxx', 'x')

Neo4j 5.19

New features
Feature Details

Functionality New Added a new STRING and LIST concatenation operator.

RETURN "Hello" || " " || "World";

RETURN [1, 2] || [3, 4, 5];

Functionality New New FINISH clause, which can be optionally used to define a
query that returns no result.

FINISH

Functionality New The keyword DISTINCT can now be added after a UNION as
the explicit form of a UNION with duplicate removal.

RETURN 1 AS a
UNION DISTINCT
RETURN 1 AS a

Functionality New Extension of the LOAD CSV clause to allow loading CSV files
from AWS S3 URIs.

LOAD CSV FROM 's3://artists.csv' AS row


MERGE (a:Artist {name: row[1], year: toInteger(
row[2])})
RETURN a.name, a.year

Functionality New Added support for additional Vertex AI vector encoding


models. Also added support for Vertex AI taskType and
• "textembedding-gecko@002" title embedding parameters.

• "textembedding-gecko@003"

• "textembedding-gecko-multilingual@001"

Neo4j 5.18

New features

Feature Details

Functionality New Added a new keyword INSERT, which can be used as a


synonym to CREATE for creating nodes and relationships.

INSERT

Functionality New Extension of the simple CASE expression, allowing multiple


matching values to be comma-separated in the same WHEN
statement. The simple CASE uses an implied equals (=)
MATCH (n)
RETURN CASE n.prop comparator, and this extension additionally allows other
WHEN IS NULL THEN "Null" comparison predicates to be explicitly specified before the
WHEN < 0 THEN "Negative" matching value in an extended version of the simple CASE.
WHEN 2, 4, 6, 8 THEN "Even"
ELSE "Odd"
END

Functionality New Added command to create relationship vector indexes. The


index configuration settings vector.dimensions and
vector.similarity_function are mandatory when using this
CREATE VECTOR INDEX [index_name] [IF NOT EXISTS]
FOR ()-[r:REL_TYPE]-() ON (r.property) command. The command allows for the IF NOT EXISTS flag
OPTIONS {indexConfig: { to skip index creation should the index already exist.
`vector.dimensions`: $dimension,
`vector.similarity_function`: $similarityFunction
}}

Functionality New Introduction of vector similarity functions. These functions


return a FLOAT representing the similarity of vectors a and b.

RETURN vector.similarity.euclidean(a, b)
RETURN vector.similarity.cosine(a, b)

Neo4j 5.17

Updated features
Feature Details

Functionality Updated When attempting to create an index using IF NOT EXISTS


with either the same name or same index type and schema,
or both, as an existing index the command now returns a
CREATE [index_type] INDEX [index_name] IF NOT
EXISTS FOR ... notification showing the existing index which blocks the
creation.

Functionality Updated When attempting to create a constraint using IF NOT EXISTS


with either the same name or same constraint type and
schema (and property type for property type constraints), or
CREATE CONSTRAINT [constraint_name] IF NOT EXISTS
FOR ... both, as an existing constraint the command now returns a
notification showing the existing constraint which blocks the
creation.


Functionality Updated When attempting to drop a non-existing index using IF


EXISTS the command will now return a notification about the
index not existing.
DROP CONSTRAINT constraint_name IF EXISTS

Functionality Updated When attempting to drop a non-existing constraint using IF


EXISTS the command will now return a notification about the
constraint not existing.
DROP INDEX index_name IF EXISTS

New features
Feature Details

Functionality New Introduction of a normalize() function. This function


normalizes a STRING according to the specified normalization
form, which can be of type NFC, NFD, NFKC, or NFKD.
RETURN normalize("string", NFC)

Functionality New Introduction of an IS NORMALIZED operator. The operator


can be used to check if a STRING is normalized according to
the specified normalization form, which can be of type NFC,
IS [NOT] [NFC | NFD | NFKC | NFKD] NORMALIZED
NFD, NFKC, or NFKD.

RETURN "string" IS NORMALIZED


Functionality New Introduction of partitioned operators used by the parallel


runtime. These operators segment the data and operate on
New operators: each segment in parallel

• PartitionedAllNodesScan

• PartitionedDirectedAllRelationshipsScan

• PartitionedDirectedRelationshipIndexScan

• PartitionedDirectedRelationshipIndexSeek

• PartitionedDirectedRelationshipIndexSeekBy
Range
• PartitionedDirectedUnionRelationshipTypesS
can
• PartitionedNodeByLabelScan

• PartitionedNodeIndexScan

• PartitionedNodeIndexSeek

• PartitionedNodeIndexSeekByRange

• PartitionedUndirectedAllRelationshipsScan

• PartitionedUndirectedRelationshipIndexScan

• PartitionedUndirectedRelationshipIndexSeek

• PartitionedUndirectedRelationshipIndexSeek
ByRange
• PartitionedUndirectedRelationshipTypeScan

• PartitionedUndirectedUnionRelationshipType
sScan
• PartitionedUnionNodeByLabelsScan

• PartitionedUnwind

Neo4j 5.16

Updated features
Feature Details

Functionality Updated Added the ability to use parameters for the index name in the
CREATE and DROP commands.

CREATE [index_type] INDEX $name [IF NOT EXISTS]


FOR ...

DROP INDEX $name [IF EXISTS]


Functionality Updated Added the ability to use parameters for the constraint name
in the CREATE and DROP commands.

CREATE CONSTRAINT $name [IF NOT EXISTS] FOR ...

DROP CONSTRAINT $name [IF EXISTS]

New features
Feature Details

Functionality New Added the ability to grant or deny LOAD privilege on a CIDR
range. For more information, see the Operations Manual →
The CIDR privilege.
GRANT LOAD ON CIDR "127.0.0.1/32" TO role
DENY LOAD ON CIDR "::1/128" TO role

Neo4j 5.15

Deprecated features
Feature Details

Functionality Deprecated The Unicode character `\u0085` is deprecated for


unescaped identifiers and will be considered as a
RETURN 1 as my\u0085identifier whitespace character in the future. To continue
using it, escape the identifier by adding backticks
around the identifier. This applies to all unescaped
identifiers in Cypher, such as label expressions,
properties, variable names or parameters. In the
given example, the quoted identifier would be
`my\u0085identifier`.


Functionality Deprecated The character with the Unicode representation


`\u0024` is deprecated for unescaped identifiers
RETURN 1 as my$Identifier and will not be supported in the future. To continue
using it, escape the identifier by adding backticks
around the identifier. This applies to all unescaped
identifiers in Cypher, such as label expressions,
properties, variable names or parameters. In the
given example, the quoted identifier would be
`my$identifier`.

The following Unicode Characters are deprecated in


identifiers: '\u0000', '\u0001', '\u0002', '\u0003',
'\u0004', '\u0005', '\u0006', '\u0007', '\u0008',
'\u000E', '\u000F', '\u0010', '\u0011', '\u0012',
'\u0013', '\u0014', '\u0015', '\u0016', '\u0017',
'\u0018', '\u0019', '\u001A', '\u001B', '\u007F',
'\u0080', '\u0081', '\u0082', '\u0083', '\u0084',
'\u0086', '\u0087', '\u0088', '\u0089', '\u008A',
'\u008B', '\u008C', '\u008D', '\u008E', '\u008F',
'\u0090', '\u0091', '\u0092', '\u0093', '\u0094',
'\u0095', '\u0096', '\u0097', '\u0098', '\u0099',
'\u009A', '\u009B', '\u009C', '\u009D', '\u009E',
'\u009F', '\u0024', '\u00A2', '\u00A3', '\u00A4',
'\u00A5', '\u00AD', '\u0600', '\u0601', '\u0602',
'\u0603', '\u0604', '\u0605', '\u061C', '\u06DD',
'\u070F', '\u08E2', '\u180E', '\u200B', '\u200C',
'\u200D', '\u200E', '\u200F', '\u202A', '\u202B',
'\u202C', '\u202D', '\u202E', '\u2060', '\u2061',
'\u2062', '\u2063', '\u2064', '\u2066', '\u2067',
'\u2068', '\u2069', '\u206A', '\u206B', '\u206C',
'\u206D', '\u206E', '\u206F', '\u2E2F', '\uFEFF',
'\uFFF9', '\uFFFA', '\uFFFB'

Updated features
Feature Details

Functionality Updated Extended SHOW INDEXES with easy filtering for vector indexes.
This is equivalent to SHOW INDEXES WHERE type = 'VECTOR'.

SHOW VECTOR INDEXES


Functionality Updated IS :: STRING NOT NULL is now an index-compatible


predicate.

MATCH (n:Label) WHERE $param IS :: STRING NOT NULL


AND n.prop = $param

New features
Feature Details

Functionality New Added a new keyword ALL, explicitly defining that the
aggregate function is not DISTINCT. This is a mirror of the
already existing keyword DISTINCT for functions.
MATCH (n)
RETURN count(ALL n.prop)

Functionality New Added command to create node vector indexes, replacing the
db.index.vector.createNodeIndex procedure. The index
configuration settings vector.dimensions and
CREATE VECTOR INDEX [index_name] [IF NOT EXISTS]
FOR (n: Label) ON (n.property) vector.similarity_function are mandatory when using this
OPTIONS {indexConfig: { command. The command allows for the IF NOT EXISTS flag
`vector.dimensions`: $dimension, to skip index creation should the index already exist.
`vector.similarity_function`: $similarityFunction
}}

Neo4j 5.14

Updated features
Feature Details

Functionality Updated Extended type syntax to allow an exclamation mark ! as a


synonym for NOT NULL.

IS :: INTEGER!

New features
Feature Details

Functionality New Introduction of a nullIf() function. This function returns null if


the two given parameters are equivalent, otherwise returns
the value of the first parameter.
RETURN nullIf(v1, v2)
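
For illustration, with literal values:

RETURN
  nullIf(4, 4) AS same,        // null: the two values are equivalent
  nullIf('abc', 'def') AS diff // 'abc': the values differ, so the first parameter is returned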


Functionality New Added a new keyword NODETACH, explicitly


defining that relationships will not be detached and
MATCH (n) NODETACH DELETE n deleted. This is a mirror of the already existing
keyword DETACH.

Neo4j 5.13

Updated features
Feature Details

Functionality Updated Updated the signatures column in SHOW FUNCTIONS


and SHOW PROCEDURES.
SHOW FUNCTIONS YIELD *
SHOW PROCEDURES YIELD * Procedure signatures now follow the pattern:
"procedureName(param1 :: TYPE, param2 :: TYPE,
.., paramN :: TYPE) :: (returnParam1 :: TYPE,
returnParam2, .., returnParamN :: TYPE)"

The signature for procedures with no return


columns now follows the pattern:
"procedureName(param1 :: TYPE, param2 :: TYPE,
.., paramN :: TYPE)"

Function signatures now follow the pattern:


"functionName(param1 :: TYPE, param2 :: TYPE,
.., paramN :: TYPE) :: TYPE"

For all available Cypher types, see the section on


types and their synonyms.

New features
Feature Details

Functionality New Beta Introduction of the Change Data Capture (CDC) feature. For
details, see Change Data Capture.

CALL cdc.current()
CALL cdc.earliest()
CALL cdc.query(from, selectors)

Functionality New Introduction of a valueType() function. This function returns a


STRING representation of the most precise value type that the
given expression evaluates to.
RETURN valueType(expr)
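
For illustration (the output is a STRING naming the most precise type the value can have):

RETURN
  valueType('abc') AS stringType, // expected: "STRING NOT NULL"
  valueType(1) AS integerType     // expected: "INTEGER NOT NULL"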


Functionality New Introduction of a char_length() function. This function returns


the number of Unicode characters in a STRING. It is an alias of
the size() function.
RETURN char_length(expr)

Functionality New Introduction of a character_length() function. This function


returns the number of Unicode characters in a STRING. It is an
alias of the size() function.
RETURN character_length(expr)

Functionality New New privilege that controls a user’s ability to load data.
Unlike other privileges, these are not granted, denied, or
New privilege: revoked on graphs, databases, or the DBMS, but instead on
ALL DATA.

GRANT LOAD ON ALL DATA TO `role`

Functionality New New graph function, graph.byElementId(), that resolves the


constituent graph to which a given element id belongs.

USE graph.byElementId(elementId :: STRING)

Functionality New Introduction of the parallel runtime. This runtime is designed


for analytical, graph-global read queries run on machines
with several available CPUs.
CYPHER runtime = parallel

Neo4j 5.12

New features
Feature Details

Functionality New New database function to return database names


from element ids.
db.nameFromElementId(elementId :: STRING) ::
STRING

Neo4j 5.11

Updated features

Feature Details

Functionality Updated Introduced a new column composite to SHOW


ALIASES. This column is returned by default.
SHOW ALIASES
The column returns the name of the composite
database that the alias belongs to, or null if the
alias does not belong to a composite database.

Functionality Updated Extended type predicate expressions. Closed


dynamic union types (type1 | type2 | …) are now
IS [NOT] :: <TYPE> supported. For example, the following query which
evaluates to true if a value is either of type INTEGER
or FLOAT:

IS :: INTEGER | FLOAT

Functionality Updated Extended node and relationship property type


constraints. Closed dynamic union types (type1 |
CREATE CONSTRAINT name FOR (n:Label) REQUIRE type2 | …) are now supported, allowing for types
n.prop IS :: <PROPERTY TYPE> such as:
CREATE CONSTRAINT name FOR ()-[r:TYPE]-() REQUIRE
r.prop IS :: <PROPERTY TYPE> • INTEGER | FLOAT

• LIST<STRING NOT NULL> | STRING

• ZONED DATETIME | LOCAL DATETIME

Functionality Updated This command now auto-commits even when


executed inside an explicit transaction.
ALTER CURRENT USER
SET PASSWORD FROM 'password1' TO 'password2'

Neo4j 5.10

Updated features

Feature Details

Functionality Updated Extended type predicate expressions. The newly


supported types are:
IS [NOT] :: <TYPE>
• NOTHING

• NULL

• BOOLEAN NOT NULL

• STRING NOT NULL

• INTEGER NOT NULL

• FLOAT NOT NULL

• DATE NOT NULL

• LOCAL TIME NOT NULL

• ZONED TIME NOT NULL

• LOCAL DATETIME NOT NULL

• ZONED DATETIME NOT NULL

• DURATION NOT NULL

• POINT NOT NULL

• NODE

• NODE NOT NULL

• RELATIONSHIP

• RELATIONSHIP NOT NULL

• MAP

• MAP NOT NULL

• LIST<TYPE>

• LIST<TYPE> NOT NULL

• PATH

• PATH NOT NULL

• PROPERTY VALUE

• PROPERTY VALUE NOT NULL

• ANY

• ANY NOT NULL


Functionality Updated Extended node and relationship property type


constraints. The new supported types are:
CREATE CONSTRAINT name FOR (n:Label) REQUIRE
n.prop IS :: <PROPERTY TYPE> • LIST<BOOLEAN NOT NULL>
CREATE CONSTRAINT name FOR ()-[r:TYPE]-() REQUIRE • LIST<STRING NOT NULL>
r.prop IS :: <PROPERTY TYPE>
• LIST<INTEGER NOT NULL>

• LIST<FLOAT NOT NULL>

• LIST<DATE NOT NULL>

• LIST<LOCAL TIME NOT NULL>

• LIST<ZONED TIME NOT NULL>

• LIST<LOCAL DATETIME NOT NULL>

• LIST<ZONED DATETIME NOT NULL>

• LIST<DURATION NOT NULL>

• LIST<POINT NOT NULL>

Neo4j 5.9

Deprecated features
Feature Details

Functionality Deprecated Creating a node or relationship entity, and then


referencing that entity in a property definition in the
CREATE (a {foo:1}), (b {foo:a.foo}) same CREATE clause is deprecated. Split the CREATE
clause into two separate clauses instead.

Updated features
Feature Details

Functionality Updated Introduced an isDeprecated column to SHOW


SETTINGS, SHOW FUNCTIONS, and SHOW PROCEDURES. It
SHOW SETTINGS YIELD * is not returned by default in either command.
SHOW FUNCTIONS YIELD *
SHOW PROCEDURES YIELD *
The column is true if the setting/function/procedure
is deprecated and false otherwise.


Functionality Updated Introduced an isDeprecated field to the argument


and return description maps for SHOW FUNCTIONS
SHOW FUNCTIONS YIELD argumentDescription and SHOW PROCEDURES.
SHOW PROCEDURES YIELD argumentDescription,
returnDescription
The field is true if the argument/return value is
deprecated and false otherwise.

Functionality Updated Introduced propertyType column, which is returned


by default. It returns a STRING representation of the
SHOW CONSTRAINTS property type for property type constraints, and
null for other constraints.

New features
Feature Details

Functionality New Introduction of quantified path patterns - a new method in


graph pattern matching for matching paths of a variable
length. More information can be found here.
MATCH ((x:A)-[:R]->(z:B WHERE z.h > x.h)){1,5}

Functionality New The Repeat(Trail) operator is used to solve


quantified path patterns. More information can be
New operator: Repeat(Trail) found here.

Functionality New Added type predicate expressions. The available


types are:
IS [NOT] :: <TYPE>
• BOOLEAN

• STRING

• INTEGER

• FLOAT

• DATE

• LOCAL TIME

• ZONED TIME

• LOCAL DATETIME

• ZONED DATETIME

• DURATION

• POINT


Functionality New Added node and relationship property type


constraints. The available property types are:
CREATE CONSTRAINT name FOR (n:Label) REQUIRE
n.prop IS :: <PROPERTY TYPE> • BOOLEAN
CREATE CONSTRAINT name FOR ()-[r:TYPE]-() REQUIRE • STRING
r.prop IS :: <PROPERTY TYPE>
• INTEGER

• FLOAT

• DATE

• LOCAL TIME

• ZONED TIME

• LOCAL DATETIME

• ZONED DATETIME

• DURATION

• POINT

Functionality New Added filtering for the new property constraints to


SHOW CONSTRAINTS. Includes filtering for the node
SHOW NODE PROPERTY TYPE CONSTRAINTS part, relationship part, or both parts.

SHOW REL[ATIONSHIP] PROPERTY TYPE CONSTRAINTS

SHOW PROPERTY TYPE CONSTRAINTS

Functionality New List supported privileges on the current server.

SHOW SUPPORTED PRIVILEGE[S]

Neo4j 5.8

Updated features

Feature Details

Functionality Updated Introduced lastRead, readCount, and trackedSince


columns. Both lastRead and readCount are returned
SHOW INDEXES by default.

The lastRead column returns the last time the index


was used for reading. The readCount column
returns the number of read queries that have been
issued to this index. The trackedSince column
returns the time when usage statistics tracking
started for this index.

New features
Feature Details

Functionality New The AssertSameRelationship operator is used to


ensure that no relationship property uniqueness
New operator: AssertSameRelationship constraints are violated in the slotted and
interpreted runtime. More information can be found
here.

Neo4j 5.7

Deprecated features
Feature Details

Functionality Deprecated The Cypher query option


connectComponentsPlanner is deprecated and will
CYPHER connectComponentsPlanner=greedy MATCH (a), be removed without a replacement. The product’s
(b) RETURN *
default behavior of using a cost-based IDP search
algorithm when combining sub-plans will be kept.
CYPHER connectComponentsPlanner=idp MATCH (a), (b)
RETURN *

Updated features

Feature Details

Functionality Updated New sub-clause WAIT for ALTER DATABASE. This


enables adding a waiting clause to specify a time
ALTER DATABASE ... [WAIT [n [SEC[OND[S]]]]|NOWAIT] limit in which the command must be completed and
returned.

Functionality New Added relationship key and property uniqueness


constraints.
CREATE CONSTRAINT name FOR ()-[r:TYPE]-() REQUIRE
r.prop IS UNIQUE

CREATE CONSTRAINT name FOR ()-[r:TYPE]-() REQUIRE


r.prop IS RELATIONSHIP KEY

Functionality New Added filtering for the new constraint types to SHOW
CONSTRAINTS. Includes filtering for the node part,
SHOW NODE UNIQUE[NESS] CONSTRAINTS relationship part, or both parts of each type (NODE
KEY filtering already exists previously).
SHOW REL[ATIONSHIP] UNIQUE[NESS] CONSTRAINTS

SHOW UNIQUE[NESS] CONSTRAINTS The existing UNIQUENESS filter will now return both
SHOW REL[ATIONSHIP] KEY CONSTRAINTS node and relationship property uniqueness
SHOW KEY CONSTRAINTS constraints.

New features
Feature Details

Functionality New New fine-grained control mechanism to control


how an inner transaction impacts subsequent inner
CALL { and/or outer transactions.
<inner>
} IN TRANSACTIONS [ OF <num> ROWS ]
[ ON ERROR CONTINUE / BREAK / FAIL ] • ON ERROR CONTINUE - will ignore errors and
[ REPORT STATUS AS <v> ] continue with the execution of subsequent inner
transactions when one of them fails.

• ON ERROR BREAK - will ignore an error and stop


the execution of subsequent inner transactions.

• ON ERROR FAIL - will fail in case of an error.

• REPORT STATUS AS <v> - reports the execution


status of the inner transaction (a map value
including the fields started committed,
transactionId, and errorMessage). This flag is
disallowed for ON ERROR FAIL.

Neo4j 5.6

New features
Feature Details

Functionality New New functionality to change tags at runtime via


ALTER SERVER. More information can be found in the
server.tag
Operations Manual → ALTER SERVER options.

Functionality New New expression which returns the results of a


subquery collected in a list.
COLLECT {
...
}

Functionality New List configuration settings on the current server.

The setting-name is either a comma-separated list


SHOW SETTING[S] [setting-name[,...]]
[YIELD { * | field[, ...] } [ORDER BY field[, of one or more quoted STRING values or a single
...]] [SKIP n] [LIMIT n]]
[WHERE expression] expression resolving to a STRING or a
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP LIST<STRING>`.
n] [LIMIT n]]

Functionality New New privilege that controls a user’s access to


New privilege: desired configuration settings.

SHOW SETTING[S] name-globbing[,...]

Neo4j 5.5

Deprecated features
Feature Details

Functionality Deprecated Using differently ordered return items in a UNION


[ALL] clause is deprecated. Replaced by:
RETURN 'val' as one, 'val' as two
UNION
RETURN 'val' as two, 'val' as one RETURN 'val' as one, 'val' as two
UNION
RETURN 'val' as one, 'val' as two

RETURN 'val' as one, 'val' as two


UNION ALL
RETURN 'val' as two, 'val' as one RETURN 'val' as one, 'val' as two
UNION ALL
RETURN 'val' as one, 'val' as two

New features
Feature Details

Functionality New The IntersectionNodeByLabelsScan operator


fetches all nodes that have all of the provided labels
New operator: IntersectionNodeByLabelsScan from the node label index. More information can be
found here.

Neo4j 5.3

Updated features
Feature Details

Functionality Updated Changes to the visibility of databases hosted on


offline servers.
SHOW DATABASES
For such databases:

• The address column will return NULL.

• The currentStatus column will return unknown.

• The statusMessage will return Server is


unavailable.

Functionality Updated An EXISTS subquery now supports any non-writing


query. For example, it now supports UNION and CALL
EXISTS { clauses.
...
}

Functionality Updated A COUNT subquery now supports any non-writing


query. For example, it now supports UNION and CALL
COUNT { clauses.
...
}

Functionality Updated The property uniqueness constraint type filter now


allow both UNIQUE and UNIQUENESS keywords.
SHOW UNIQUE[NESS] CONSTRAINTS

New features

Feature Details

Functionality New The NodeByElementIdSeek operator reads one or


more nodes by ID from the node store, specified via
New operator: NodeByElementIdSeek the function elementId(). More information can be
found here.

Neo4j 5.2

Updated features
Feature Details

Functionality Updated Creating composite databases now allows for an


empty options clause. There are no applicable
CREATE COMPOSITE DATABASE name OPTIONS {} option values for composite databases.

Functionality New To preview of the result of either REALLOCATE or


DEALLOCATE without executing, prepend the
DRYRUN REALLOCATE|DEALLOCATE DATABASES FROM command with DRYRUN.
<serverId>

Neo4j 5.1

Deprecated features
Feature Details

Functionality Deprecated The text index provider text-1.0 is deprecated and


replaced by text-2.0.
CREATE TEXT INDEX ... OPTIONS {indexProvider:
`text-1.0`}

Updated features
Feature Details

Functionality Updated A new text index provider is available, text-2.0.


This is also the default provider if none is given.
CREATE TEXT INDEX ... OPTIONS {indexProvider:
`text-2.0`}

Neo4j 5.0

Removed features
Feature Details

Functionality Removed Replaced by:

SHOW EXISTS CONSTRAINTS SHOW [PROPERTY] EXIST[ENCE] CONSTRAINTS

SHOW NODE EXISTS CONSTRAINTS SHOW NODE [PROPERTY] EXIST[ENCE] CONSTRAINTS

SHOW RELATIONSHIP EXISTS CONSTRAINTS SHOW REL[ATIONSHIP] [PROPERTY] EXIST[ENCE]


CONSTRAINTS

Functionality Removed Replaced by:

SHOW INDEXES BRIEF SHOW INDEXES

SHOW CONSTRAINTS BRIEF SHOW CONSTRAINTS

Functionality Removed Replaced by:

SHOW INDEXES VERBOSE SHOW INDEXES YIELD *

SHOW CONSTRAINTS VERBOSE SHOW CONSTRAINTS YIELD *

Functionality Removed Replaced by:

DROP INDEX ON :Label(prop) DROP INDEX name


Functionality Removed Replaced by:

DROP CONSTRAINT ON (n:Label) ASSERT (n.prop) IS DROP CONSTRAINT name


NODE KEY

DROP CONSTRAINT ON (n:Label) ASSERT (n.prop) IS


UNIQUE

DROP CONSTRAINT ON (n:Label) ASSERT exists(n.prop)

DROP CONSTRAINT ON ()-[r:Type]-() ASSERT exists


(r.prop)

Functionality Removed Replaced by:

CREATE INDEX ON :Label(prop) CREATE INDEX FOR (n:Label) ON (n.prop)

Functionality Removed Replaced by:

CREATE CONSTRAINT ON ... ASSERT ... CREATE CONSTRAINT FOR ... REQUIRE ...

Functionality Removed B-tree indexes are removed.

B-tree indexes used for STRING predicates are


CREATE BTREE INDEX ...
replaced by:

Functionality Removed
CREATE TEXT INDEX ...

CREATE INDEX
... B-tree indexes used for spatial queries are replaced
OPTIONS "{" btree-option: btree-value[, ...] "}"
by:

CREATE POINT INDEX ...

B-tree indexes used for general queries or property


value types are replaced by:

CREATE [RANGE] INDEX ...

These new indexes may be combined for multiple


use cases.


Functionality Removed B-tree indexes are removed.

Replaced by:
SHOW BTREE INDEXES

SHOW {POINT | RANGE | TEXT} INDEXES

Functionality Removed B-tree indexes are removed.

Replaced by:
USING BTREE INDEXES

USING {POINT | RANGE | TEXT} INDEX

Functionality Removed Node key and property uniqueness constraints


backed by B-tree indexes are removed.
CREATE CONSTRAINT
... Replaced by:
OPTIONS "{" btree-option: btree-value[, ...] "}"

CREATE CONSTRAINT ...

Constraints used for STRING properties require an


additional text index to cover the STRING predicates
properly. Constraints used for point properties
require an additional point index to cover the spatial
queries properly.

Functionality Removed The uniqueness output has been removed along


with the concept of index uniqueness, as it actually
SHOW INDEXES YIELD uniqueness belongs to the constraint and not the index.

The new column owningConstraint was introduced


to indicate whether an index belongs to a constraint
or not.

Functionality Removed The ownedIndexId output has been removed and


replaced by the new ownedIndex column.
SHOW CONSTRAINTS YIELD ownedIndexId

Functionality Removed Replaced by:


For privilege commands:
ON HOME DATABASE
ON DEFAULT DATABASE


Functionality Removed Replaced by:


For privilege commands:
ON HOME GRAPH
ON DEFAULT GRAPH

Functionality Removed The allocatedBytes output has been removed,


because it was never tracked and thus was always
SHOW TRANSACTIONS YIELD allocatedBytes 0.

Functionality Removed Replaced by:

exists(prop) prop IS NOT NULL

Functionality Removed Replaced by:

NOT exists(prop) prop IS NULL

Functionality Removed Replaced by 0o....

0...

Functionality Removed Only 0x... (lowercase x) is supported.

0X...

Functionality Removed Remaining support for repeated relationship


variables is removed.
MATCH ()-[r]-()
RETURN [ ()-[r]-()-[r]-() | r ] AS rs

Functionality Removed Automatic coercion of a list to a boolean is


removed.
WHERE [1,2,3]
Replaced by:

WHERE NOT isEmpty([1, 2, 3])


Functionality Removed Replaced by:

distance(n.prop, point({x:0, y:0})) point.distance(n.prop, point({x:0, y:0}))

Functionality Removed The ability to use operators <, <=, >, or >= on spatial
points is removed. Instead, use:
point({x:0, y:0}) <= point({x:1, y:1}) <= point({
x:2, y:2})
point.withinBBox(point({x:1, y:1}), point({x:0, y
:0}), point({x:2, y:2}))

Functionality Removed Replaced by:

USING PERIODIC COMMIT ... CALL {


...
} IN TRANSACTIONS

Functionality Removed It is no longer allowed to have CREATE clauses in


which a variable introduced in the pattern is also
CREATE (a {prop:7})-[r:R]->(b {prop: a.prop}) referenced from the same pattern.

Functionality Removed Unaliased expressions are no longer supported in


subquery RETURN clauses. Replaced by:
CALL { RETURN 1 }
CALL { RETURN 1 AS one }

Functionality Removed Pattern expressions producing lists of paths are no


longer supported, but they can still be used as
MATCH (a) RETURN (a)--() existence predicates, for example in WHERE clauses.
Instead, use a pattern comprehension:

MATCH (a) RETURN [p=(a)--() | p]


Functionality Removed Implied grouping keys are no longer supported.


Only expressions that do not contain aggregations
MATCH (n) RETURN n.propertyName_1, are still considered grouping keys. In expressions
n.propertyName_2 + count(*)
that contain aggregations, the leaves must be
either:

• An aggregation

• A literal

• A parameter

• A variable, ONLY IF it is either: 1) A projection


expression on its own (e.g. the n in RETURN n AS
myNode, n.value + count(*))
2) A local variable in the expression (e.g the x in
RETURN n, n.prop + size([ x IN range(1,
10) | x ])

• Property access, ONLY IF it is also a projection


expression on its own (e.g. the n.prop in RETURN
n.prop, n.prop + count(*))

• Map access, ONLY IF it is also a projection


expression on its own (e.g. the map.prop in WITH
{prop: 2} AS map RETURN map.prop, map.prop
+ count(*))

Deprecated features
Feature Details

Functionality Deprecated Use the properties() function instead to get the


map of properties of nodes/relationships that can
MATCH (n)-[r:REL]->(m) SET n=r then be used in a SET clause:

MATCH (n)-[r:REL]->(m) SET n=properties(r)

Functionality Deprecated shortestPath and allShortestPaths without


variable-length relationship are deprecated. Instead,
MATCH (a), (b), allShortestPaths((a)-[r]->(b)) use a MATCH with a LIMIT of 1 or:
RETURN b

MATCH (a), (b), shortestPath((a)-[r]->(b)) RETURN MATCH (a), (b), shortestPath((a)-[r*1..1]->(b))


b RETURN b


Functionality Deprecated Creating a database with unescaped dots in the


name has been deprecated, instead escape the
CREATE DATABASE databaseName.withDot ... database name:

CREATE DATABASE `databaseName.withDot` ...

Functionality Deprecated Replaced by:

()-[:A|:B]->() ()-[:A|B]->()

Updated features
Feature Details

Functionality Updated The default index type is changed from B-tree to


range index.
CREATE INDEX ...

Functionality Updated The new column owningConstraint was added and


will be returned by default from now on. It will list
SHOW INDEXES the name of the constraint that the index is
associated with or null, in case it is not associated
with any constraint.

Functionality Updated The new column ownedIndex was added and will be
returned by default from now on. It will list the
SHOW CONSTRAINTS name of the index associated with the constraint or
null, in case no index is associated with it.


Functionality Updated New columns for the current query are added:

• currentQueryStartTime
SHOW TRANSACTIONS YIELD *
• currentQueryStatus

• currentQueryActiveLockCount

• currentQueryElapsedTime

• currentQueryCpuTime

• currentQueryWaitTime

• currentQueryIdleTime

• currentQueryAllocatedBytes

• currentQueryPageHits

• currentQueryPageFaults

These columns are only returned in the full set (with


YIELD) and not by default.

Functionality Updated Terminate transaction now allows YIELD. The WHERE


clause is not allowed on its own, as it is for SHOW,
TERMINATE TRANSACTIONS transaction-id[,...] but needs the YIELD clause.
YIELD { * | field[, ...] }
[ORDER BY field[, ...]]
[SKIP n]
[LIMIT n]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP
n] [LIMIT n]]

Functionality Updated transaction-id now allows general expressions


resolving to a STRING or LIST<STRING> instead of
SHOW TRANSACTIONS [transaction-id[,...]] just parameters.

TERMINATE TRANSACTIONS transaction-id[,...]


Functionality Updated The SHOW and TERMINATE TRANSACTIONS commands


can be combined in the same query. The query does
SHOW TRANSACTIONS [transaction-id[,...]] not require a specific order and there can be zero or
YIELD field[, ...] more of each command type, however at least one
[ORDER BY field[, ...]]
[SKIP n] command is needed.
[LIMIT n]
[WHERE expression]
TERMINATE TRANSACTIONS transaction-id[,...] When the command is not in standalone mode, the
YIELD field[, ...]
[ORDER BY field[, ...]]
YIELD and RETURN clauses are mandatory. YIELD * is
[SKIP n] not allowed.
[LIMIT n]
[WHERE expression]
RETURN field[, ...] transaction-id is a comma-separated list of one or
[ORDER BY field[, ...]]
[SKIP n] more quoted STRING values. It could also be an
[LIMIT n] expression resolving to a STRING or a LIST<STRING>
(for example the output column from SHOW).

Functionality Updated Not a syntax change but a semantic one. The


EXECUTE BOOSTED privilege will no longer include an
GRANT EXECUTE BOOSTED PROCEDURE ... implicit EXECUTE privilege when granted. That
GRANT EXECUTE BOOSTED FUNCTION ...
means that to execute a procedure or a function
with boosted privileges both EXECUTE and EXECUTE
BOOSTED are needed.

Functionality Updated Privileges can be specified as IMMUTABLE, which


means that they cannot be altered by users with
[GRANT|DENY] [IMMUTABLE] ... Privilege Management. They can only be
administered with auth disabled.

Functionality Updated IMMUTABLE can now be specified with the REVOKE


command to specify that only immutable privileges
REVOKE [IMMUTABLE] ... should be revoked.


Functionality Updated Changes to the default columns in the result:

• The writer, type, and constituents columns


SHOW DATABASES
have been added.

• The values returned in the role column have


changes to be just primary, secondary, or
unknown.

• The error column has been renamed to


statusMessage.

The following columns have been added to the full


result set (with YIELD) and not by default:

• creationTime

• lastStartTime

• lastStopTime

• store

• currentPrimariesCount

• currentSecondariesCount

• requestedPrimariesCount

• requestedSecondariesCount

Functionality Updated Previously, if n.prop is null, 'one' would be


returned. Now, 'two' is returned.
MATCH (n)
RETURN This is a semantic change only. Since null = null
CASE n.prop
WHEN null THEN 'one' returns false in Cypher, a WHEN expression no
ELSE 'two' longer matches on null.
END

If matching on null is required, please use IS NULL


instead:

MATCH (n)
RETURN
CASE
WHEN n.prop IS NULL THEN 'one'
ELSE 'two'
END


Functionality Updated Rounding infinity and NaN values will now return the original value instead of
returning an integer approximation for precision 0 and throwing an exception for precision > 0:

RETURN round(val, precision)

              old value              new value

round(Inf)    9223372036854776000.0  Inf

round(Inf, 1) exception              Inf

round(NaN)    0                      NaN

round(NaN, 1) exception              NaN

To get an integer value use the toInteger function.

Functionality Updated The alias commands can now handle aliases in


composite databases.
CREATE [OR REPLACE] ALIAS
compositeDatabase.aliasName ...
ALTER ALIAS compositeDatabase.aliasName
DROP ALIAS compositeDatabase.aliasName

Functionality Updated SHOW ALIAS now allows for easy filtering on alias
name.
SHOW ALIAS[ES] aliasName FOR DATABASE[S]
SHOW ALIAS[ES] compositeDatabase.aliasName FOR
DATABASE[S]

Functionality Updated The alias commands can now handle aliases in


composite databases.
CREATE [OR REPLACE] ALIAS
compositeDatabase.aliasName ...
ALTER ALIAS compositeDatabase.aliasName
DROP ALIAS compositeDatabase.aliasName

Functionality Updated SHOW ALIAS now allows for easy filtering on alias
name.
SHOW ALIAS[ES] aliasName FOR DATABASE[S]
SHOW ALIAS[ES] compositeDatabase.aliasName FOR
DATABASE[S]

New features

Feature Details

Functionality New New Cypher command for creating and dropping


composite databases.
CREATE [OR REPLACE] COMPOSITE DATABASE
databaseName [IF NOT EXISTS] [WAIT [n [SEC[OND[
S]]]]|NOWAIT]
DROP COMPOSITE DATABASE databaseName [IF EXISTS]
[DUMP DATA | DESTROY DATA] [WAIT [n [SEC[OND[S
]]]]|NOWAIT]

Functionality New New privileges that allow a user to CREATE and/or


New privilege: DROP composite databases.

CREATE COMPOSITE DATABASE


DROP COMPOSITE DATABASE
COMPOSITE DATABASE MANAGEMENT

Functionality New Cypher now supports number literals with


underscores between digits.
1_000_000, 0x_FF_FF, 0o_88_88

Functionality New New function which returns whether the given number is NaN. NaN is a special floating point
number defined in the Floating-Point Standard IEEE 754. This function was introduced since comparisons such as
NaN = NaN return false.

isNaN(n.prop)
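
For illustration, a floating-point zero divided by zero yields NaN, which only isNaN() detects reliably (an illustrative sketch):

RETURN
  isNaN(0.0 / 0.0) AS detected,      // true
  0.0 / 0.0 = 0.0 / 0.0 AS compared  // false: NaN = NaN is never true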

Functionality New Cypher now supports float literals for the values
Infinity and NaN. NaN defines a quiet not-a-number
NaN, Inf, Infinity value and does not throw any exceptions in
arithmetic operations. Both values are implemented
according to the Floating-Point Standard IEEE 754.

Functionality New New expression which returns the number of


results of a subquery.
COUNT { (n) WHERE n.foo = "bar" }

Functionality New New sub-clause for CREATE DATABASE, to specify the


number of servers hosting a database, when
CREATE DATABASE ... TOPOLOGY n PRIMAR{Y|IES} [m creating a database in cluster environments.
SECONDAR{Y|IES}]


Functionality New New sub-clause for ALTER DATABASE, which allows


modifying the number of servers hosting a
ALTER DATABASE ... SET TOPOLOGY n PRIMAR{Y|IES} [m database in cluster environments.
SECONDAR{Y|IES}]

Functionality New New Cypher command for enabling servers.

ENABLE SERVER ...

Functionality New New Cypher command for setting options for a


server.
ALTER SERVER ... SET OPTIONS ...

Functionality New New Cypher command for changing the name of a


server.
RENAME SERVER ... TO ...

Functionality New New Cypher command for re-balancing what


servers host which databases.
REALLOCATE DATABASES

Functionality New New Cypher command for moving all databases


from servers.
DEALLOCATE DATABASE[S] FROM SERVER[S] ...

Functionality New New Cypher command for dropping servers.

DROP SERVER ...

Functionality New New Cypher command for listing servers.

SHOW SERVERS


Functionality New New privileges that allow a user to create, modify,


New privileges: reallocate, deallocate, drop and list servers.

SERVER MANAGEMENT

SHOW SERVERS

Functionality New New concise syntax for expressing predicates for


which labels a node may have, referred to as label
MATCH (n: A&(B|C)&!D) expression.

Functionality New New concise syntax for expressing predicates for


which relationship types a relationship may have,
MATCH ()-[r:(!A&!B)]->() referred to as relationship type expression.

Functionality New New syntax that enables inlining of WHERE clauses


inside relationship patterns.
MATCH ()-[r:R {prop1: 42} WHERE r.prop2 > 42]->()

Neo4j 4.4

Deprecated features

Feature Details

Functionality Deprecated Implied grouping keys are deprecated. Only


expressions that do not contain aggregations are
MATCH (n) RETURN n.propertyName_1, still considered grouping keys. In expressions that
n.propertyName_2 + count(*)
contain aggregations, the leaves must be either:

• An aggregation

• A literal

• A parameter

• A variable, ONLY IF it is either:


1) A projection expression on its own (e.g. the n
in RETURN n AS myNode, n.value + count(*))
2) A local variable in the expression (e.g the x in
RETURN n, n.prop + size([ x IN range(1,
10) | x ])

• Property access, ONLY IF it is also a projection


expression on its own (e.g. the n.prop in RETURN
n.prop, n.prop + count(*))

• Map access, ONLY IF it is also a projection


expression on its own (e.g. the map.prop in WITH
{prop: 2} AS map RETURN map.prop, map.prop
+ count(*))

Syntax Deprecated Replaced by:

USING PERIODIC COMMIT ... CALL {


...
} IN TRANSACTIONS

Syntax Deprecated CREATE clauses in which a variable introduced in the


pattern is also referenced from the same pattern are
CREATE (a {prop:7})-[r:R]->(b {prop: a.prop}) deprecated.

Syntax Deprecated Replaced by:

CREATE CONSTRAINT ON ... ASSERT ... CREATE CONSTRAINT FOR ... REQUIRE ...


Functionality Deprecated B-tree indexes are deprecated.

B-tree indexes used for string queries are replaced


CREATE BTREE INDEX ...
by:

CREATE TEXT INDEX ...

B-tree indexes used for spatial queries are replaced


by:

CREATE POINT INDEX ...


Functionality Deprecated

CREATE INDEX B-tree indexes used for general queries or property


... value types are replaced by:
OPTIONS "{" btree-option: btree-value[, ...] "}"

CREATE RANGE INDEX ...

These new indexes may be combined for multiple


use cases.

Functionality Deprecated B-tree indexes are deprecated.

Replaced by:
SHOW BTREE INDEXES

SHOW {POINT | RANGE | TEXT} INDEXES

Functionality Deprecated B-tree indexes are deprecated.

Replaced by:
USING BTREE INDEX

USING {POINT | RANGE | TEXT} INDEX


Functionality Deprecated Node key and property uniqueness constraints with


B-tree options are deprecated.
CREATE CONSTRAINT
... Replaced by:
OPTIONS "{" btree-option: btree-value[, ...] "}"

CREATE CONSTRAINT
...
OPTIONS "{" range-option: range-value[, ...] "}"

Constraints used for string properties will also


require an additional text index to cover the string
queries properly. Constraints used for point
properties will also require an additional point index
to cover the spatial queries properly.

Functionality Deprecated Replaced by:

distance(n.prop, point({x:0, y:0})) point.distance(n.prop, point({x:0, y:0}))

Functionality Deprecated The ability to use the inequality operators <, <=, >,
and >= on spatial points is deprecated. Instead, use:
point({x:0, y:0}) <= point({x:1, y:1}) <= point({
x:2, y:2})
point.withinBBox(point({x:1, y:1}), point({x:0, y
:0}), point({x:2, y:2}))

Functionality Deprecated Currently, if n.prop is null, 'one' would be


returned. Since null = null returns false in
MATCH (n) Cypher, a WHEN expression will no longer match in
RETURN future versions.
CASE n.prop
WHEN null THEN 'one'
ELSE 'two' Please use IS NULL instead:
END

MATCH (n)
RETURN
CASE
WHEN n.prop IS NULL THEN 'one'
ELSE 'two'
END

New features

Feature Details

Functionality New New clause for evaluating a subquery in separate


transactions. Typically used when modifying or
CALL { importing large amounts of data. See CALL { ... } IN
... TRANSACTIONS.
} IN TRANSACTIONS

Syntax New New syntax for creating constraints, applicable to


all constraint types.
CREATE CONSTRAINT FOR ... REQUIRE ...

Functionality New Property uniqueness constraints now allow multiple


properties, ensuring that the combination of
CREATE CONSTRAINT [constraint_name] [IF NOT property values are unique.
EXISTS]
FOR (n:LabelName)
REQUIRE (n.propertyName_1, …, n.propertyName_n) IS
UNIQUE
[OPTIONS "{" option: value[, ...] "}"]

Functionality New Deprecated Property uniqueness constraints now allow multiple


properties.
DROP CONSTRAINT
ON (n:LabelName) Replaced by:
ASSERT (n.propertyName_1, …, n.propertyName_n) IS
UNIQUE
DROP CONSTRAINT name [IF EXISTS]

Syntax New Existence constraints now allow an OPTIONS map,


however, at this point there are no available values
CREATE CONSTRAINT [constraint_name] [IF NOT for the map.
EXISTS]
FOR ...
REQUIRE ... IS NOT NULL
OPTIONS "{" "}"

Functionality New Token lookup indexes now allow an OPTIONS map to


specify the index provider.
CREATE LOOKUP INDEX [index_name] [IF NOT EXISTS]
FOR ... ON ...
OPTIONS "{" option: value[, ...] "}"

Functionality New Allows creating text indexes on nodes or


relationships with a particular label or relationship
CREATE TEXT INDEX ... type, and property combination. They can be
dropped by using their name.


Functionality New Allows creating range indexes on nodes or


relationships with a particular label or relationship
CREATE RANGE INDEX ... type, and properties combination. They can be
dropped by using their name.

Functionality New Allows creating node key and property uniqueness


constraints backed by range indexes by providing
CREATE CONSTRAINT the range index provider in the OPTIONS map.
...
OPTIONS "{" indexProvider: 'range-1.0' "}"

Functionality New Allows creating point indexes on nodes or


relationships with a particular label or relationship
CREATE POINT INDEX ... type, and property combination. They can be
dropped by using their name.

Syntax New New privilege that allows a user to assume


New privilege: privileges of another one.

IMPERSONATE

Functionality New

SHOW TRANSACTION[S] [transaction-id[,...]]
[YIELD { * | field[, ...] } [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]
[WHERE expression]
[RETURN field[, ...] [ORDER BY field[, ...]] [SKIP n] [LIMIT n]]

List transactions on the current server. The transaction-id is a comma-separated list of one or more
quoted STRING values, a STRING parameter, or a list parameter.

This replaces the procedures dbms.listTransactions and dbms.listQueries.

Functionality New

TERMINATE TRANSACTION[S] transaction-id[,...]

Terminate transactions on the current server. The transaction-id is a comma-separated list of one or
more quoted STRING values, a STRING parameter, or a list parameter.

This replaces the procedures dbms.killTransaction, dbms.killTransactions, dbms.killQuery, and
dbms.killQueries.


Functionality New New Cypher command for modifying a database by


changing its access mode.
ALTER DATABASE ... [IF EXISTS]
SET ACCESS {READ ONLY | READ WRITE}

Functionality New New privilege that allows a user to modify


New privilege: databases.

ALTER DATABASE

Functionality New New privilege that allows a user to modify database


New privilege: access mode.

SET DATABASE ACCESS

Functionality New New Cypher command for creating an alias for a


database name. Remote aliases are only supported
CREATE ALIAS ... [IF NOT EXISTS] from Neo4j 4.4.8.
FOR DATABASE ...

Functionality New New Cypher command for creating or replacing an


alias for a database name. Remote aliases are only
CREATE OR REPLACE ALIAS ... supported from Neo4j 4.4.8.
FOR DATABASE ...

Functionality New New Cypher command for altering an alias. Remote


aliases are only supported from Neo4j 4.4.8.
ALTER ALIAS ... [IF EXISTS]
SET DATABASE ...

Functionality New New Cypher command for dropping a database


alias.
DROP ALIAS ... [IF EXISTS] FOR DATABASE

Functionality New New Cypher command for listing database aliases.


Only supported since Neo4j 4.4.8.
SHOW ALIASES FOR DATABASE


Functionality New New privilege that allows a user to create, modify,


New privilege: delete and list aliases. Only supported since Neo4j
4.4.8.
ALIAS MANAGEMENT

Functionality New New privilege that allows a user to create aliases.


New privilege: Only supported since Neo4j 4.4.8.

CREATE ALIAS

Functionality New New privilege that allows a user to modify aliases.


New privilege: Only supported since Neo4j 4.4.8.

ALTER ALIAS

Functionality New New privilege that allows a user to delete aliases.


New privilege: Only supported since Neo4j 4.4.8.

DROP ALIAS

Functionality New New privilege that allows a user to show aliases.


New privilege: Only supported since Neo4j 4.4.8.

SHOW ALIAS

Syntax New New syntax that enables inlining of WHERE clauses


inside node patterns.
MATCH (n:N {prop1: 42} WHERE n.prop2 > 42)

Neo4j 4.3

Deprecated features

Feature Details

Syntax Deprecated

CREATE CONSTRAINT [name]
ON (node:Label)
ASSERT exists(node.property)

Replaced by:

CREATE CONSTRAINT [name]
ON (node:Label)
ASSERT node.property IS NOT NULL

Syntax Deprecated

CREATE CONSTRAINT [name]
ON ()-[rel:REL]-()
ASSERT exists(rel.property)

Replaced by:

CREATE CONSTRAINT [name]
ON ()-[rel:REL]-()
ASSERT rel.property IS NOT NULL

Syntax Deprecated

exists(prop)

Replaced by:

prop IS NOT NULL

Syntax Deprecated

NOT exists(prop)

Replaced by:

prop IS NULL

Syntax Deprecated Replaced by default output columns.


BRIEF [OUTPUT] for SHOW INDEXES and SHOW
CONSTRAINTS.

Syntax Deprecated Replaced by:


VERBOSE [OUTPUT] for SHOW INDEXES and SHOW
CONSTRAINTS. YIELD *

Syntax Deprecated Replaced by:

SHOW EXISTS CONSTRAINTS SHOW [PROPERTY] EXIST[ENCE] CONSTRAINTS

Still allows BRIEF and VERBOSE but not YIELD or


WHERE.

Syntax Deprecated Replaced by:

SHOW NODE EXISTS CONSTRAINTS SHOW NODE [PROPERTY] EXIST[ENCE] CONSTRAINTS

Still allows BRIEF and VERBOSE but not YIELD or


WHERE.


Syntax Deprecated Replaced by:

SHOW RELATIONSHIP EXISTS CONSTRAINTS SHOW RELATIONSHIP [PROPERTY] EXIST[ENCE]


CONSTRAINTS

Still allows BRIEF and VERBOSE but not YIELD or


WHERE.

Syntax Deprecated Replaced by:


For privilege commands:
ON HOME DATABASE
ON DEFAULT DATABASE

Syntax Deprecated Replaced by:


For privilege commands:
ON HOME GRAPH
ON DEFAULT GRAPH

Syntax Deprecated

MATCH (a) RETURN (a)--()

Pattern expressions producing lists of paths are deprecated, but they can still be used as existence
predicates, for example in WHERE clauses. Instead, use a pattern comprehension:

MATCH (a) RETURN [p=(a)--() | p]

Updated features
Feature Details

Functionality Updated Now allows filtering for:

SHOW INDEXES WHERE ... SHOW INDEXES

Functionality Updated Now allows filtering for:

SHOW CONSTRAINTS WHERE ... SHOW CONSTRAINTS


Functionality Updated Now allows YIELD, WHERE, and RETURN clauses to


SHOW INDEXES to change the output.
SHOW INDEXES YIELD ...
[WHERE ...]
[RETURN ...]

Functionality Updated Now allows YIELD, WHERE, and RETURN clauses to


SHOW CONSTRAINTS to change the output.
SHOW CONSTRAINTS YIELD ...
[WHERE ...]
[RETURN ...]

Syntax Updated New syntax for filtering SHOW CONSTRAINTS on


property existence constraints.
SHOW [PROPERTY] EXIST[ENCE] CONSTRAINTS Allows YIELD and WHERE but not BRIEF or VERBOSE.

Syntax Updated New syntax for filtering SHOW CONSTRAINTS on node


property existence constraints.
SHOW NODE [PROPERTY] EXIST[ENCE] CONSTRAINTS Allows YIELD and WHERE but not BRIEF or VERBOSE.

Syntax Updated New syntax for filtering SHOW CONSTRAINTS on


relationship property existence constraints.
SHOW REL[ATIONSHIP] [PROPERTY] EXIST[ENCE] Allows YIELD and WHERE but not BRIEF or VERBOSE.
CONSTRAINTS

Functionality Updated Now allows easy filtering for SHOW INDEXES on


fulltext indexes.
SHOW FULLTEXT INDEXES Allows YIELD and WHERE but not BRIEF or VERBOSE.

Functionality Updated Now allows easy filtering for SHOW INDEXES on token
lookup indexes.
SHOW LOOKUP INDEXES Allows YIELD and WHERE but not BRIEF or VERBOSE.

New features

Feature Details

Syntax New New syntax to pass options to CREATE DATABASE.


This can be used to specify a specific cluster node
CREATE DATABASE ... to seed data from.
[OPTIONS {...}]

Syntax New New syntax for creating node property existence


constraints.
CREATE CONSTRAINT [name]
ON (node:Label)
ASSERT node.property IS NOT NULL

Syntax New New syntax for creating relationship property


existence constraints.
CREATE CONSTRAINT [name]
ON ()-[rel:REL]-()
ASSERT rel.property IS NOT NULL

Syntax New

ALTER USER name IF EXISTS ...

Makes altering users idempotent. If the specified name does not exist, no error is thrown.

Syntax New Now allows setting home database for user.

ALTER USER ...


SET HOME DATABASE ...

Syntax New Now allows removing home database for user.

ALTER USER ...


REMOVE HOME DATABASE

Syntax New CREATE USER now allows setting home database for
user.
CREATE USER ...
SET HOME DATABASE ...

Syntax New New syntax for showing the home database of the
current user.
SHOW HOME DATABASE


Syntax New New Cypher command for administering privilege


New privilege: for changing users home database.

SET USER HOME DATABASE

Syntax New New syntax for privileges affecting home database.


For privilege commands:

ON HOME DATABASE

Syntax New New syntax for privileges affecting home graph.


For privilege commands:

ON HOME GRAPH

Syntax New Allows creating fulltext indexes on nodes or


relationships. They can be dropped by using their
CREATE FULLTEXT INDEX ... name.

Functionality New Allows creating indexes on relationships with a


particular relationship type and property
CREATE INDEX FOR ()-[r:TYPE]-() ... combination. They can be dropped by using their
name.

Functionality New Create token lookup index for nodes with any labels
or relationships with any relationship type. They can
CREATE LOOKUP INDEX ... be dropped by using their name.

Functionality New New Cypher command for changing the name of a


role.
RENAME ROLE

Functionality New New Cypher command for changing the name of a


user.
RENAME USER


Functionality New New Cypher commands for listing procedures.

SHOW PROCEDURE[S]
[EXECUTABLE [BY {CURRENT USER | username}]]
[YIELD ...]
[WHERE ...]
[RETURN ...]

Functionality New New Cypher commands for listing functions.

SHOW [ALL | BUILT IN | USER DEFINED] FUNCTION[S]


[EXECUTABLE [BY {CURRENT USER | username}]]
[YIELD ...]
[WHERE ...]
[RETURN ...]

Neo4j 4.2

Deprecated features
Feature Details

Syntax Deprecated Replaced by 0o....

0...

Syntax Deprecated Only 0x... (lowercase x) is supported.

0X...

Syntax Deprecated Unaliased expressions are deprecated in subquery


RETURN clauses. Replaced by:
CALL { RETURN 1 }
CALL { RETURN 1 AS one }

Updated features
Feature Details

Functionality Updated Can now handle multiple roles.

SHOW ROLE name PRIVILEGES SHOW ROLES n1, n2, ... PRIVILEGES


Functionality Updated Can now handle multiple users.

SHOW USER name PRIVILEGES SHOW USERS n1, n2, ... PRIVILEGES

Functionality Updated The round() function can now take an additional


argument to specify rounding precision.
round(expression, precision)

Functionality Updated The round() function can now take two additional
arguments to specify rounding precision and
round(expression, precision, mode) rounding mode.

New features
Feature Details

Functionality New Privileges can now be shown as Cypher commands.

SHOW PRIVILEGES [AS [REVOKE] COMMAND[S]]

Syntax New New optional part of the Cypher commands for


database privileges.
DEFAULT GRAPH

Syntax New Cypher now interprets literals with prefix 0o as an


octal integer literal.
0o...

Syntax New For CREATE USER and ALTER USER, it is now possible
to set (or update) a password when the plaintext
SET [PLAINTEXT | ENCRYPTED] PASSWORD password is unknown, but the encrypted password
is available.

Functionality New New Cypher commands for administering privileges


New privilege: for executing procedures and user defined
functions. See The DBMS EXECUTE privileges.
EXECUTE


Syntax New Allows setting index provider and index


configuration when creating an index.
CREATE [BTREE] INDEX ... [OPTIONS {...}]

Syntax New Allows setting index provider and index


configuration for the backing index when creating a
CREATE CONSTRAINT ... IS NODE KEY [OPTIONS {...}] node key constraint.

Syntax New Allows setting index provider and index


configuration for the backing index when creating a
CREATE CONSTRAINT ... IS UNIQUE [OPTIONS {...}] property uniqueness constraint.

Syntax New New Cypher command for showing current logged-


in user and roles.
SHOW CURRENT USER

Functionality New New Cypher commands for listing indexes.

Replaces the procedures db.indexes,


SHOW [ALL | BTREE] INDEX[ES] [BRIEF | VERBOSE
[OUTPUT]] db.indexDetails (verbose), and partially
db.schemaStatements (verbose).

Functionality New New Cypher commands for listing constraints.

Replaces the procedures db.constraints and


SHOW [ALL | UNIQUE | NODE EXIST[S] | RELATIONSHIP
EXIST[S] | EXIST[S] | NODE KEY] CONSTRAINT[S] partially db.schemaStatements (verbose).
[BRIEF | VERBOSE [OUTPUT]]

Functionality New New Cypher command for administering privilege


New privilege: for listing indexes.

SHOW INDEX

Functionality New New Cypher command for administering privilege


New privilege: for listing constraints.

SHOW CONSTRAINTS

Neo4j 4.1.3

New features
Feature Details

Syntax New Makes index creation idempotent. If an index with


the name or schema already exists no error will be
CREATE INDEX [name] IF NOT EXISTS FOR ... thrown.

Syntax New Makes index deletion idempotent. If no index with


the name exists no error will be thrown.
DROP INDEX name IF EXISTS

Syntax New Makes constraint creation idempotent. If a


constraint with the name or type and schema
CREATE CONSTRAINT [name] IF NOT EXISTS ON ... already exists no error will be thrown.

Syntax New Makes constraint deletion idempotent. If no


constraint with the name exists no error will be
DROP CONSTRAINT name IF EXISTS thrown.

Neo4j 4.1

Restricted features
Feature Details

Functionality Restricted No longer revokes sub-privileges when revoking a


compound privilege, e.g. when revoking INDEX
REVOKE ... MANAGEMENT, any CREATE INDEX and DROP INDEX
privileges will no longer be revoked.

Functionality Restricted No longer includes the privileges START DATABASE


and STOP DATABASE.
ALL DATABASE PRIVILEGES

Updated features

Feature Details

Procedure Updated The queryId procedure format has changed, and no


longer includes the database name. For example,
queryId mydb-query-123 is now query-123. This change
affects built-in procedures dbms.listQueries(),
dbms.listActiveLocks(queryId),
dbms.killQueries(queryIds) and
dbms.killQuery(queryId).

Functionality Updated The returned privileges are a closer match to the


original grants and denies, e.g. if granted MATCH the
SHOW PRIVILEGES command will show that specific privilege and not
the TRAVERSE and READ privileges. Added support for
YIELD and WHERE clauses to allow filtering results.

New features
Feature Details

Functionality New The PUBLIC role is automatically assigned to all


New role: users, giving them a set of base privileges.

PUBLIC

Syntax New The MATCH privilege can now be revoked.


For privileges:

REVOKE MATCH

Functionality New New support for YIELD and WHERE clauses to allow
filtering results.
SHOW USERS

Functionality New New support for YIELD and WHERE clauses to allow
filtering results.
SHOW ROLES

Functionality New New support for YIELD and WHERE clauses to allow
filtering results.
SHOW DATABASES


Functionality New New Cypher commands for administering


TRANSACTION MANAGEMENT privileges transaction management.

Functionality New New Cypher commands for administering user


DBMS USER MANAGEMENT privileges management.

Functionality New New Cypher commands for administering database


DBMS DATABASE MANAGEMENT privileges management.

Functionality New New Cypher commands for administering privilege


DBMS PRIVILEGE MANAGEMENT privileges management.

Functionality New New Cypher command for administering role, user,


database and privilege management.
ALL DBMS PRIVILEGES

Functionality New New Cypher command for administering read and


write privileges.
ALL GRAPH PRIVILEGES

Functionality New New Cypher commands for administering write


Write privileges privileges.

Functionality New New optional part of the Cypher commands for


database privileges.
ON DEFAULT DATABASE

Neo4j 4.0

Removed features
Feature Details

Function Removed Replaced by relationships().

rels()


Function Removed Replaced by toInteger().

toInt()

Function Removed Replaced by toLower().

lower()

Function Removed Replaced by toUpper().

upper()

Function Removed Replaced by list comprehension.

extract()

Function Removed Replaced by list comprehension.

filter()

Functionality Removed The RULE planner was removed in 3.2, but still
For Rule planner: possible to trigger using START or CREATE UNIQUE
clauses. Now it is completely removed.
CYPHER planner=rule

Functionality Removed The removal of the RULE planner in 3.2 was the
Explicit indexes beginning of the end for explicit indexes. Now they
are completely removed, including the removal of
the built-in procedures for Neo4j 3.3 to 3.5.

Functionality Removed Replaced by the new pipelined runtime which


For compiled runtime: covers a much wider range of queries.

CYPHER runtime=compiled

Clause Removed Running queries with this clause will cause a syntax
error.
CREATE UNIQUE


Clause Removed Running queries with this clause will cause a syntax
error.
START

Syntax Removed Replaced by MATCH (n)-[:A|B|C {foo: 'bar'}]-()


RETURN n.
MATCH (n)-[:A|:B|:C {foo: 'bar'}]-() RETURN n

Syntax Removed Replaced by MATCH (n)-[x:A|B|C]-() RETURN n.

MATCH (n)-[x:A|:B|:C]-() RETURN n

Syntax Removed Replaced by MATCH (n)-[x:A|B|C*]-() RETURN n.

MATCH (n)-[x:A|:B|:C*]-() RETURN n

Syntax Removed Replaced by $parameter.

{parameter}

Deprecated features
Feature Details

Syntax Deprecated As in Cypher 3.2, this is replaced by:

MATCH (n)-[rs*]-() RETURN rs MATCH p=(n)-[*]-() RETURN relationships(p) AS rs

Syntax Deprecated Replaced by CREATE INDEX FOR (n:Label) ON


(n.prop).
CREATE INDEX ON :Label(prop)

Syntax Deprecated Replaced by DROP INDEX name.

DROP INDEX ON :Label(prop)


Syntax Deprecated Replaced by DROP CONSTRAINT name.

DROP CONSTRAINT ON (n:Label) ASSERT (n.prop) IS


NODE KEY

Syntax Deprecated Replaced by DROP CONSTRAINT name.

DROP CONSTRAINT ON (n:Label) ASSERT (n.prop) IS


UNIQUE

Syntax Deprecated Replaced by DROP CONSTRAINT name.

DROP CONSTRAINT ON (n:Label) ASSERT exists(n.prop)

Syntax Deprecated Replaced by DROP CONSTRAINT name.

DROP CONSTRAINT ON ()-[r:Type]-() ASSERT exists


(r.prop)

Restricted features
Feature Details

Function Restricted

length()

Restricted to only work on paths. See length() for more details.

Function Restricted

size()

Only works for strings, lists and pattern comprehensions, and no longer works for paths. For versions
above 5.0, use a COUNT expression instead:

RETURN COUNT { (a)-[]->(b) }

For versions below 5.0, use a pattern comprehension instead:

RETURN size([ (a)-[]->(b) | a ])

See size() and Count Subqueries for more details.

Updated features
Feature Details

Syntax Extended The create constraint syntax can now include a


name.
CREATE CONSTRAINT [name] ON ...
The IS NODE KEY and IS UNIQUE versions of this
command replace the procedures db.createNodeKey
and db.createUniquePropertyConstraint,
respectively.

New features
Feature Details

Functionality New This Neo4j Enterprise Edition only feature involves a


Pipelined runtime: new runtime that has many performance
enhancements.
CYPHER runtime=pipelined

Functionality New New Cypher commands for administering multiple


Multi-database administration databases.

Functionality New New Cypher commands for administering role-


Access control based access control.

Functionality New New Cypher commands for administering dbms,


Fine-grained security database, graph and sub-graph access control.

Syntax New New syntax for creating indexes, which can include
a name.
CREATE INDEX [name] FOR (n:Label) ON (n.prop)
Replaces the db.createIndex procedure.

Syntax New New command for dropping an index by name.

DROP INDEX name

Syntax New New command for dropping a constraint by name,


no matter the type.
DROP CONSTRAINT name


Clause New EXISTS subqueries are subclauses used to filter the


results of a MATCH, OPTIONAL MATCH, or WITH clause.
WHERE EXISTS {...}

Clause New New clause to specify which graph a query, or


query part, is executed against.
USE neo4j

Neo4j 3.5

Deprecated features
Feature Details

Functionality Deprecated The compiled runtime will be discontinued in the


Compiled runtime: next major release. It might still be used for default
queries in order to not cause regressions, but
CYPHER runtime=compiled explicitly requesting it will not be possible.

Function Deprecated Replaced by list comprehension.

extract()

Function Deprecated Replaced by list comprehension.

filter()

Neo4j 3.4
Feature Type Change Details

Spatial point types Functionality Amendment A point — irrespective of


which Coordinate Reference
System is used — can be
stored as a property and is
able to be backed by an
index. Prior to this, a point
was a virtual property only.

point() - Cartesian 3D Function Added

point() - WGS 84 3D Function Added


randomUUID() Function Added

Temporal types Functionality Added Supports storing, indexing


and working with the
following temporal types:
Date, Time, LocalTime,
DateTime, LocalDateTime
and Duration.

Temporal functions Functionality Added Functions allowing for the


creation and manipulation of
values for each temporal
type — Date, Time,
LocalTime, DateTime,
LocalDateTime and Duration.

Temporal operators Functionality Added Operators allowing for the


manipulation of values for
each temporal type — Date,
Time, LocalTime, DateTime,
LocalDateTime and Duration.

toString() Function Extended Now also allows temporal


values as input (i.e. values of
type Date, Time, LocalTime,
DateTime, LocalDateTime or
Duration).

Neo4j 3.3
Feature Type Change Details

START Clause Removed As in Cypher 3.2, any queries


using the START clause will
revert back to Cypher 3.1
planner=rule. However, there
are built-in procedures for
Neo4j versions 3.3 to 3.5 for
accessing explicit indexes.
The procedures will enable
users to use the current
version of Cypher and the
cost planner together with
these indexes. An example of
this is CALL
db.index.explicit.searchNo
des('my_index','email:me*'
).

CYPHER runtime=slotted Functionality Added Neo4j Enterprise Edition only


(Faster interpreted runtime)


max(), min() Function Extended Now also supports


aggregation over sets
containing lists of strings
and/or numbers, as well as
over sets containing strings,
numbers, and lists of strings
and/or numbers

Neo4j 3.2
Feature Type Change Details

CYPHER planner=rule (Rule Functionality Removed All queries now use the cost
planner) planner. Any query
prepended thus will fall back
to using Cypher 3.1.

CREATE UNIQUE Clause Removed Running such queries will fall


back to using Cypher 3.1 (and
use the rule planner)

START Clause Removed Running such queries will fall


back to using Cypher 3.1 (and
use the rule planner)

MATCH (n)-[rs*]-() RETURN Syntax Deprecated Replaced by MATCH p=(n)-


rs [*]-() RETURN
relationships(p) AS rs

MATCH (n)-[:A|:B|:C {foo: Syntax Deprecated Replaced by MATCH (n)-


'bar'}]-() RETURN n [:A|B|C {foo: 'bar'}]-()
RETURN n

MATCH (n)-[x:A|:B|:C]-() Syntax Deprecated Replaced by MATCH (n)-


RETURN n [x:A|B|C]-() RETURN n

MATCH (n)-[x:A|:B|:C*]-() Syntax Deprecated Replaced by MATCH (n)-


RETURN n [x:A|B|C*]-() RETURN n

User-defined aggregation Functionality Added


functions

Composite indexes Index Added

Node Key Index Added Neo4j Enterprise Edition only

CYPHER runtime=compiled Functionality Added Neo4j Enterprise Edition only


(Compiled runtime)

reverse() Function Extended Now also allows a list as


input

max(), min() Function Extended Now also supports


aggregation over a set
containing both strings and
numbers

Neo4j 3.1
Feature Type Change Details

rels() Function Deprecated Replaced by relationships()

toInt() Function Deprecated Replaced by toInteger()

lower() Function Deprecated Replaced by toLower()

upper() Function Deprecated Replaced by toUpper()

toBoolean() Function Added

Map projection Syntax Added

Pattern comprehension Syntax Added

User-defined functions Functionality Added

CALL...YIELD...WHERE Clause Extended Records returned by YIELD


may be filtered further using
WHERE

Neo4j 3.0
Feature Type Change Details

has() Function Removed Replaced by exists()

str() Function Removed Replaced by toString()

{parameter} Syntax Deprecated Replaced by $parameter

properties() Function Added

CALL [...YIELD] Clause Added

point() - Cartesian 2D Function Added

point() - WGS 84 2D Function Added

distance() Function Added

User-defined procedures Functionality Added

toString() Function Extended Now also allows Boolean


values as input

Appendix
Appendix A: Cypher styleguide
The purpose of the Cypher styleguide is to make queries as easy to read as possible.

For rules and recommendations for naming of labels, relationship types and properties, please see the
Naming rules and recommendations.

General recommendations
• When using Cypher language constructs in prose, use a monospaced font and follow the styling rules.

• When referring to labels and relationship types, the colon should be included as follows: :Label,
:REL_TYPE.

• When referring to functions, use lower camel case and parentheses. For example: toString().

• If you are storing Cypher statements in a separate file, use the file extension .cypher.

Indentation and line breaks


• Start a new clause on a new line.

Bad

MATCH (n) WHERE n.name CONTAINS 's' RETURN n.name

Good

MATCH (n)
WHERE n.name CONTAINS 's'
RETURN n.name

• Indent ON CREATE and ON MATCH with two spaces. Put ON CREATE before ON MATCH if both are present.

Bad

MERGE (n) ON CREATE SET n.prop = 0


MERGE (a:A)-[:T]->(b:B)
ON MATCH SET b.name = 'you'
ON CREATE SET a.name = 'me'
RETURN a.prop

Good

MERGE (n)
ON CREATE SET n.prop = 0
MERGE (a:A)-[:T]->(b:B)
ON CREATE SET a.name = 'me'
ON MATCH SET b.name = 'you'
RETURN a.prop

• Start a subquery on a new line after the opening brace, indented with two (additional) spaces. Leave
the closing brace on its own line.

Bad

MATCH (a:A)
WHERE
EXISTS { MATCH (a)-->(b:B) WHERE b.prop = 'yellow' }
RETURN a.foo

Also bad

MATCH (a:A)
WHERE EXISTS
{MATCH (a)-->(b:B)
WHERE b.prop = 'yellow'}
RETURN a.foo

Good

MATCH (a:A)
WHERE EXISTS {
MATCH (a)-->(b:B)
WHERE b.prop = 'yellow'
}
RETURN a.foo

• Do not break the line if the simplified subquery form is used.

Bad

MATCH (a:A)
WHERE EXISTS {
(a)-->(b:B)
}
RETURN a.prop

Good

MATCH (a:A)
WHERE EXISTS { (a)-->(b:B) }
RETURN a.prop

Casing
• Write keywords in upper case.

Bad

match (p:Person)
where p.name starts with 'Ma'
return p.name

Good

MATCH (p:Person)
WHERE p.name STARTS WITH 'Ma'
RETURN p.name

• Write the value null in lower case.

Bad

WITH NULL AS n1, Null AS n2


RETURN n1 IS NULL AND n2 IS NOT NULL

Good

WITH null AS n1, null AS n2


RETURN n1 IS NULL AND n2 IS NOT NULL

• Write BOOLEAN literals (true and false) in lower case.

Bad

WITH TRUE AS b1, False AS b2


RETURN b1 AND b2

Good

WITH true AS b1, false AS b2


RETURN b1 AND b2

• Use camel case, starting with a lower-case character, for:


◦ functions

◦ properties

◦ variables

◦ parameters

Bad

CREATE (N:Label {Prop: 0})


WITH N, RAND() AS Rand, $pArAm AS MAP
RETURN Rand, MAP.property_key, count(N)

Good

CREATE (n:Label {prop: 0})


WITH n, rand() AS rand, $param AS map
RETURN rand, map.propertyKey, count(n)

Spacing
• For literal maps:
◦ No space between the opening brace and the first key

◦ No space between key and colon

◦ One space between colon and value

◦ No space between value and comma

◦ One space between comma and next key

◦ No space between the last value and the closing brace

Bad

WITH { key1 :'value' ,key2 : 42 } AS map


RETURN map

Good

WITH {key1: 'value', key2: 42} AS map


RETURN map

• One space between label/type predicates and property predicates in patterns.

Bad

MATCH (p:Person{property: -1})-[:KNOWS {since: 2016}]->()


RETURN p.name

Good

MATCH (p:Person {property: -1})-[:KNOWS {since: 2016}]->()


RETURN p.name

• No space in patterns.

Bad

MATCH (:Person) --> (:Vehicle)


RETURN count(*)

Good

MATCH (:Person)-->(:Vehicle)
RETURN count(*)

• Use a wrapping space around operators.

Bad

MATCH p=(s)-->(e)
WHERE s.name<>e.name
RETURN length(p)

Good

MATCH p = (s)-->(e)
WHERE s.name <> e.name
RETURN length(p)

• No space in label predicates.

Bad

MATCH (person : Person : Owner )


RETURN person.name

Good

MATCH (person:Person:Owner)
RETURN person.name

• Use a space after each comma in lists and enumerations.

Bad

MATCH (),()
WITH ['a','b',3.14] AS list
RETURN list,2,3,4

Good

MATCH (), ()
WITH ['a', 'b', 3.14] AS list
RETURN list, 2, 3, 4

• No padding space within function call parentheses.

Bad

RETURN split( 'original', 'i' )

Good

RETURN split('original', 'i')

• Use padding space within simple subquery expressions.

Bad

MATCH (a:A)
WHERE EXISTS {(a)-->(b:B)}
RETURN a.prop

Good

MATCH (a:A)
WHERE EXISTS { (a)-->(b:B) }
RETURN a.prop

Patterns
• When patterns wrap lines, break after arrows, not before.

Bad

MATCH (:Person)-->(vehicle:Car)-->(:Company)
<--(:Country)
RETURN count(vehicle)

Good

MATCH (:Person)-->(vehicle:Car)-->(:Company)<--
(:Country)
RETURN count(vehicle)

• Use anonymous nodes and relationships when the variable would not be used.

Bad

MATCH (kate:Person {name: 'Kate'})-[r:LIKES]-(c:Car)


RETURN c.type

Good

MATCH (:Person {name: 'Kate'})-[:LIKES]-(c:Car)


RETURN c.type

• Chain patterns together to avoid repeating variables.

Bad

MATCH (:Person)-->(vehicle:Car), (vehicle:Car)-->(:Company)


RETURN count(vehicle)

Good

MATCH (:Person)-->(vehicle:Car)-->(:Company)
RETURN count(vehicle)

• Put named nodes before anonymous nodes.

Bad

MATCH ()-->(vehicle:Car)-->(manufacturer:Company)
WHERE manufacturer.foundedYear < 2000
RETURN vehicle.mileage

Good

MATCH (manufacturer:Company)<--(vehicle:Car)<--()
WHERE manufacturer.foundedYear < 2000
RETURN vehicle.mileage

• Keep anchor nodes at the beginning of the MATCH clause.

Bad

MATCH (:Person)-->(vehicle:Car)-->(manufacturer:Company)
WHERE manufacturer.foundedYear < 2000
RETURN vehicle.mileage

Good

MATCH (manufacturer:Company)<--(vehicle:Car)<--(:Person)
WHERE manufacturer.foundedYear < 2000
RETURN vehicle.mileage

• Prefer outgoing (left to right) pattern relationships to incoming pattern relationships.

Bad

MATCH (:Country)-->(:Company)<--(vehicle:Car)<--(:Person)
RETURN vehicle.mileage

Good

MATCH (:Person)-->(vehicle:Car)-->(:Company)<--(:Country)
RETURN vehicle.mileage

Meta-characters
• Use single quotes, ', for literal STRING values.

Bad

RETURN "Cypher"

Good

RETURN 'Cypher'

• Disregard this rule for literal STRING values that contain a single quote character. If the STRING has both,
use the form that creates the fewest escapes. In the case of a tie, prefer single quotes.

Bad

RETURN 'Cypher\'s a nice language', "Mats' quote: \"statement\""

Good

RETURN "Cypher's a nice language", 'Mats\' quote: "statement"'

• Avoid having to use back-ticks to escape characters and keywords.

Bad

MATCH (`odd-ch@racter$`:`Spaced Label` {`&property`: 42})


RETURN labels(`odd-ch@racter$`)

Good

MATCH (node:NonSpacedLabel {property: 42})


RETURN labels(node)

• Do not use a semicolon at the end of the statement.

Bad

RETURN 1;

Good

RETURN 1

GQL conformance
Last updated: 24 October 2024
Neo4j version: 5.25

GQL is the new ISO International Standard query language for graph databases.

GQL has adopted much of Cypher’s query construction semantics, such as adhering to the MATCH/RETURN
format. Consequently, Cypher now accommodates most mandatory GQL features and a substantial
portion of its optional ones (defined by the ISO/IEC 39075:2024(en) - Information technology - Database
languages - GQL Standard). Users should, therefore, only expect minimal differences between crafting
queries in Cypher and GQL. For example, the following query is valid in both languages:

Cypher and GQL

MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE a.name = 'Tom Hanks'
RETURN m.title

Cypher supports the majority of mandatory GQL features. For a full list see Supported mandatory GQL
features. There are, however, currently a few mandatory GQL features not yet in Cypher that Neo4j is
actively working towards implementing. These are listed in the page Currently unsupported mandatory
GQL features.

Neo4j is also working towards increasing its support of optional GQL features. These are listed in the page
Supported optional GQL features.

Some optional GQL features not yet implemented in Cypher already have analogous Cypher equivalents.
These features are listed in the page Optional GQL features and analogous Cypher.

Additionally, Cypher contains additional features that are not part of GQL and no GQL alternatives
currently exist for them. These features are listed in the page Additional Cypher features.

Note on minimum GQL conformance
Following the GQL Standard subclause 24.2, Minimum conformance, Cypher’s support of the following
mandatory GQL features is explicitly declared:

• Graph with an open graph type (Feature GG01).

• The Unicode Standard version used by Cypher depends on the running JVM version. Neo4j 5 added
support for JavaSE 17 and version 13 of The Unicode Standard. Neo4j 5.14 added support for JavaSE
21 and version 15 of the Unicode Standard. For more information, see Parsing → Using Unicode in
Cypher.
• Cypher supports the following mandatory GQL property types: BOOLEAN (BOOL), FLOAT[18], INTEGER
(SIGNED INTEGER, or INT)[19], and STRING (VARCHAR).

Cypher also supports the following optional GQL property types: DATE, DURATION, LIST<INNER_TYPE NOT
NULL> (ARRAY<INNER_TYPE NOT NULL>, INNER_TYPE LIST, or INNER_TYPE ARRAY)[20], LOCAL DATETIME
(TIMESTAMP WITHOUT TIME ZONE), LOCAL TIME (TIME WITHOUT TIME ZONE), POINT, ZONED DATETIME
(TIMESTAMP WITH TIME ZONE), and ZONED TIME (TIME WITH TIME ZONE). For more information, see Values
and types → property types.

Supported mandatory GQL features


Unlike optional GQL features, mandatory GQL features are not assigned a GQL feature ID code. The below
table is instead listed in order of their appearance in the ISO/IEC 39075:2024(en) GQL Standard.

GQL Standard Description Documentation Comment


subclause

4.11 Graph pattern matching Patterns

4.13 GQL object types Structural types, Includes: NODE (ANY NODE, VERTEX, ANY
Types and their VERTEX) and RELATIONSHIP (ANY
synonyms. RELATIONSHIP, EDGE, ANY EDGE).

4.16 Predefined value types Property types, Includes: BOOLEAN (BOOL), FLOAT, INTEGER
Types and their (SIGNED INTEGER, INT), and STRING (
synonyms. VARCHAR).

Cypher supports the boolean type


predicate for TRUE, FALSE, and NULL but
does not support the GQL keyword
UNKNOWN.

13.2 <insert statement> INSERT


13.3 <set statement> SET GQL’s SET has no order dependencies


because all right-hand side operations are
completed before any assignments occur.
In Cypher’s SET, the order of rows can
affect the outcome because changes made
during execution may depend on the
sequence of assignments. The only way to
guarantee row order in Neo4j is to use
ORDER BY.

13.4 <remove statement> REMOVE

13.5 <delete statement> DELETE

14.4 <match statement> MATCH, OPTIONAL


MATCH

14.9 <order by and page ORDER BY, SKIP,


statement> OFFSET, LIMIT

14.10 <primitive result FINISH


statement>

14.11 <return statement> RETURN GQL defines the option to specify RETURN
ALL (functionally equivalent to using RETURN
on its own). This is currently not available
in Cypher.

16.2 <limit clause> LIMIT

16.4 <graph pattern> Graph patterns

16.5 <insert graph pattern> INSERT, CREATE

16.6 <order by clause> ORDER-BY

16.7 <path pattern Path patterns


expression>


16.8 <label expression> Label


expressions

16.9 <path variable Patterns →


reference> Syntax and
semantics

16.11 <graph pattern Quantifiers


quantifier>

16.17 <sort specification list> Order results in


ascending or
descending
order

16.19 <offset clause> SKIP, OFFSET

19.3 <comparison predicate> Comparison


operators

19.4 <exists predicate> exists()

19.5 <null predicate> Type predicate


expressions for
NULL

19.6 <value type predicate> Type predicate expressions

19.7 <normalized predicate> IS NORMALIZED,


IS NOT
NORMALIZED

20.2 <value expression Cypher


primary> expressions

20.3 <value specification> GQL defines the SESSION_USER value


expression, which enables accessing a
user’s username within a query. In Cypher,
current user details can be seen using the
SHOW CURRENT USER command.


20.7 <case expression> CASE, nullIf(),


coalesce()

20.9 <aggregate function> avg(), count(), Cypher and GQL handle NULL values
max, min(), differently for the sum() function when
sum() queries return 0 rows. For example, RETURN
sum(<expr>) on an empty table returns
NULL in GQL, but in Cypher it returns 0.

20.11 <property reference> Core concepts

20.21 <numeric value Mathematical


expression> operators

20.22 <numeric value function> char_length(), character_length()

20.23 <string value STRING


expression> concatenation
operator (||)

20.24 <character string left(), lower(), In GQL, TRIM() removes only space
function> normalize(), characters. In Cypher, trim() removes any
right(), trim(), whitespace character.
upper()

21.1 Names and variables Syntax Cypher supports GQL’s lexical elements,
with the following caveats:

• GQL allows for extended parameter


identifiers. For example: RETURN
$0hello is allowed in GQL but not
Cypher.

• GQL allows identifiers that are not


variables to be delimited with both
backticks and quotes. Cypher only
allows backticks. For example: MATCH
(n) RETURN n."a prop" is allowed in
GQL but not Cypher.


22.15 Grouping operations Counting with


and without
duplicates

Currently unsupported mandatory GQL features


Cypher supports most mandatory GQL features. There are, however, currently a few mandatory GQL
features not yet in Cypher that Neo4j is actively working towards implementing. The table below provides
an overview of these GQL features and, where applicable, their functional equivalents in Neo4j.

Unlike optional GQL features, mandatory GQL features are not assigned a GQL feature ID code. The below
table is instead listed in order of their appearance in the ISO/IEC 39075:2024(en) GQL Standard.

GQL Standard Description Comment and similar Neo4j functionality


subclause

4.9.2 GQL-status objects Exposing successful execution results, errors,


exceptions, and warnings as GQL-status objects.

7.1-7.3 Session management GQL defines the following session commands:


SESSION SET, SESSION RESET, and SESSION CLOSE.
Neo4j offers session management through the
driver session API.

8.1-8.4 Transaction management GQL defines the following transaction commands:


START TRANSACTION, COMMIT, and ROLLBACK.

Neo4j offers transaction management through the


driver transaction API. Cypher Shell also offers
specific commands to manage transactions.

11.1 Graph expressions GQL defines the following graph reference values
commands: CURRENT_GRAPH and
CURRENT_PROPERTY_GRAPH.

17.1 Schema reference GQL defines an AT clause for selecting the current
schema and the following schema selection options:
HOME_SCHEMA and CURRENT_SCHEMA.


21.3 <token>, <separator>, and GQL specifies a list of reserved words that cannot
<identifier> be used for unquoted variable names, labels, and
property names. Cypher also specifies a list of
reserved keywords, but it differs from GQL’s.

Supported optional GQL features


This page lists the optional GQL features Cypher is either fully or partially conformant with.

Optional GQL features are assigned a feature ID code. These codes order the features in the table below.

GQL Feature ID Description Documentation Comment

G002 Different-edges match Relationship The semantic for this feature is the default
mode uniqueness in Cypher semantic.
Cypher

G004 Path variables Path patterns

G016 Any path search ANY

G017 All shortest path search ALL SHORTEST

G018 Any shortest path search ANY

G019 Counted shortest path SHORTEST


search

G020 Counted shortest group SHORTEST


search GROUPS

G035 Quantified paths Quantified path


patterns

G036 Quantified edges Quantified


relationships

G050 Parenthesized path Path patterns


pattern: WHERE clause


G051 Parenthesized path Graph patterns


pattern: non-local → Rules
predicate

G060 Bounded graph pattern Quantifiers


quantifier

G061 Unbounded graph Quantifiers


pattern quantifier

G074 Label expressions: Label


wildcard label expressions

GA06 Value type predicates Type predicate


expressions

GA07 Ordering by discarded Graph patterns


binding variables → Rules

GB01 Long identifiers Naming rules


and
recommendatio
ns → Identifier
length limit

GF01 Enhanced numeric abs(), floor(), Note the following exceptions: GQL
functions sqrt(). supports CEILING() as a synonym for the
CEIL() function. Cypher only supports
ceil().

GF02 Trigonometric functions acos(), asin(),


atan(), cos(),
cot(),
degrees(),
radians(),
tan()


GF03 Logarithmic functions exp(), log10(). Note the following exceptions:

• Cypher uses the log() function instead


of GQL’s LN() function.

• Cypher uses the exponentiation


operator (^) instead of GQL’s POWER()
function.

GF05 Multi-character trim btrim(),


functions ltrim(),
rtrim()

GF06 Explicit TRIM function trim() In GQL, TRIM() removes only space
characters. In Cypher, trim() removes any
whitespace character.

GG01 Graph with open graph


type

GP01 Inline procedure CALL subqueries

GP03 Inline procedure with CALL subqueries


explicit nested variable → Variable scope
scope clause

GP04 Named procedure calls CALL procedure

GQ01 USE graph clause USE Cypher’s USE clause supports static graph
references (e.g. USE
myComposite.myGraph) and dynamic graph
references (e.g. USE
graph.byName(<expression>)). However,
Cypher does not support GQL’s full graph
reference syntax. For example, GQL’s
graph reference values CURRENT_GRAPH and
CURRENT_PROPERTY_GRAPH cannot be used in
Cypher.

GQ03 Composite query: UNION UNION

GQ13 ORDER BY and page LIMIT, ORDER BY Cypher requires using the WITH clause,
statement: LIMIT which GQL does not.


GV39 Temporal types: date, Temporal types, Note the following exceptions:
local datetime, and local date()
time support • GQL defines a parameterless version
of the date() function not in Cypher:
CURRENT_DATE.

• GQL’s LOCAL_TIME() function is


equivalent to Cypher’s localtime()
function. GQL also defines a
parameterless version of the function
not in Cypher: LOCAL_TIME.

• GQL’s LOCAL_DATETIME() function is


equivalent to Cypher’s
localdatetime() function. GQL also
defines a parameterless version of the
function not in Cypher:
LOCAL_DATETIME.

GV40 Temporal types: zoned Temporal types Note the following exceptions:
datetime and zoned time
support • GQL’s ZONED_TIME() function is
equivalent to Cypher’s time() function.
GQL also defines a parameterless
version of the function not in Cypher:
CURRENT_TIME.

• GQL’s ZONED_DATETIME() function is


equivalent to Cypher’s datetime()
function. GQL also defines a
parameterless version of the function
not in Cypher: CURRENT_TIMESTAMP.

GV50 List value types Lists

GV55 Path value types Structural types


→ PATH

GV66 Open dynamic unions Type predicate


expressions →
ANY and NOTHING

GV67 Closed dynamic unions Closed dynamic


unions


GV70 Immaterial value types: null type support (NULL) Working with NULL

GV71 Immaterial value types: empty type support (NOTHING) Type predicate expressions → ANY and NOTHING

Cypher and GQL sometimes name functions differently and, as a result, several Cypher functions offer
the same (or very similar) functionality to their GQL counterpart. For more information, see the page
Optional GQL features and analogous Cypher.

Optional GQL features and analogous Cypher


This page lists optional GQL features that have analogous but not identical Cypher features.

Optional GQL features are assigned a feature ID code. These codes order the features in the table below.

GQL Feature ID Description Comment and similar Cypher functionality

G100 ELEMENT_ID function GQL’s ELEMENT_ID() function is equivalent to


Cypher’s elementId() function.

GF04 Enhanced path functions GQL’s PATH_LENGTH() function is equivalent to


Cypher’s length() function.

GF10 Advanced aggregate • GQL’s COLLECT_LIST() function is equivalent to


functions: general set Cypher’s collect() function.
functions
• GQL’s STDEV_SAMP() function is equivalent to
Cypher’s stDev() function.

• GQL’s STDEV_POP() function is equivalent to


Cypher’s stDevP() function.

GF11 Advanced aggregate • GQL’s PERCENTILE_CONT() function is equivalent


functions: binary set functions to Cypher’s percentileCont() function.

• GQL’s PERCENTILE_DISC() function is equivalent


to Cypher’s percentileDisc() function.

GQ08 FILTER statement Selects a subset of the records of the current


working table. Cypher uses WITH instead.


GQ09 LET statement Adds columns to the current working table. Cypher
uses WITH instead.

GQ10, GQ11, FOR statement: list value Unnests a list or a binding table by expanding the
GQ23, GQ24 support, binding table current working table. Cypher uses UNWIND instead.
support, WITH ORDINALITY, Unlike the FOR statement, UNWIND does not support
WITH OFFSET yielding indexes and offsets.

GV12 64-bit signed integer numbers GQL’s SIGNED INTEGER64 (alternatively: INTEGER64,
INT64) type is equivalent to Cypher’s INTEGER type.

GV23 Floating point type name GQL’s DOUBLE type is equivalent to Cypher’s FLOAT
synonyms type.

GV24 64-bit floating number GQL’s FLOAT64 type is equivalent to Cypher’s FLOAT
type.

GV45 Record types GQL’s open RECORD type is equivalent to the MAP
type in Cypher.

Additional Cypher features


While the GQL Standard incorporates a lot of capabilities in Cypher, Cypher contains additional features
that are not part of GQL and no GQL alternatives currently exist for them. This page covers those Cypher
features.

Clauses

Cypher feature Description

LOAD CSV Import data from CSV files.

MERGE Ensures that a pattern exists in the graph. Either the pattern already exists,
or it needs to be created.

Subqueries

Cypher feature Description

CALL { … } IN CALL subqueries executed in separate, inner transactions, producing


TRANSACTIONS intermediate commits.


COLLECT Used to create a list with the rows returned by a subquery.

COUNT Used to count the number of rows returned by a subquery.

EXISTS Used to discover if a specified pattern exists at least once in the graph
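
As an illustration of how these subquery expressions compose with an ordinary MATCH, the following
sketch filters, counts, and collects patterns around a node. The :Person label, the :KNOWS relationship
type, and the name property are illustrative placeholders, not tied to any particular dataset.

MATCH (p:Person)
WHERE EXISTS { (p)-[:KNOWS]->(:Person {name: 'Alice'}) }  // keep only people who know Alice
RETURN p.name,
       COUNT { (p)-[:KNOWS]->(:Person) } AS numberOfFriends,
       COLLECT { MATCH (p)-[:KNOWS]->(f) RETURN f.name } AS friendNames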

Values and types

Cypher feature Description

POINT values Spatial values.

MAP values. Map values - the GQL equivalent is Records.

Comprehensions and projections

Cypher feature Description

List comprehension Syntactic construct for creating a LIST based on existing lists.

Map projection Constructs MAP projections from nodes, relationships, and other MAP values.

Pattern comprehension Syntactic construct for creating a LIST based on matchings of a pattern.
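
A brief sketch showing all three constructs in one query; the :Person and :Movie labels and their
properties are assumptions made for illustration only.

MATCH (p:Person)
RETURN
  [x IN range(1, 10) WHERE x % 2 = 0 | x * x] AS squaresOfEvens,   // list comprehension
  p {.name, .born} AS personDetails,                               // map projection
  [(p)-[:ACTED_IN]->(m:Movie) | m.title] AS movieTitles            // pattern comprehension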

Functions

Database functions

Cypher feature Description

db.nameFromElementId() Resolves the database name for the given element id.

GenAI functions

Cypher feature Description

genai.vector.encode() Generates a vector embedding for a single value.

Graph functions

Cypher feature Description

graph.byElementId() Returns the graph reference with the given element id. It is only supported
in the USE clause, on composite databases.

graph.byName() Returns the graph reference of the given name. It is only supported in the
USE clause, on composite databases.

graph.names() Lists the names of graphs in the current database.

graph.propertiesByName() Returns the MAP of properties associated with a graph.

List functions

Cypher feature Description

keys() Returns a LIST<STRING> containing the STRING representations for all the
property names of a NODE, RELATIONSHIP, or MAP.

labels() Returns a LIST<STRING> containing the STRING representations for all the
labels of a NODE.

nodes() Returns a LIST<NODE> containing all the NODE values in a PATH.

range() Returns a LIST<INTEGER> comprising all INTEGER values within a specified


range.

reduce() Runs an expression against individual elements of a LIST<ANY>, storing the


result of the expression in an accumulator.

relationships() Returns a LIST<RELATIONSHIP> containing all the RELATIONSHIP values in a


PATH.

reverse() Returns a STRING or LIST<ANY> in which the order of all characters or


elements in the given STRING or LIST<ANY> have been reversed.

tail() Returns all but the first element in a LIST<ANY>.

toBooleanList() Converts a LIST<ANY> of values to a LIST<BOOLEAN> values. If any values are


not convertible to BOOLEAN they will be null in the LIST<BOOLEAN> returned.


toFloatList() Converts a LIST<ANY> to a LIST<FLOAT> values. If any values are not


convertible to FLOAT they will be null in the LIST<FLOAT> returned.

toIntegerList() Converts a LIST<ANY> to a LIST<INTEGER> values. If any values are not


convertible to INTEGER they will be null in the LIST<INTEGER> returned.

toStringList() Converts an INTEGER, FLOAT, BOOLEAN, POINT or temporal type (i.e. DATE,
ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME or DURATION)
value to a STRING, or null if the value cannot be converted.
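
A small sketch combining a few of these list functions on a literal list; no graph data is assumed.

WITH [3, 1, 4, 1, 5] AS numbers
RETURN range(0, 4) AS indexes,                                    // [0, 1, 2, 3, 4]
       tail(numbers) AS allButFirst,
       reverse(numbers) AS reversed,
       reduce(total = 0, n IN numbers | total + n) AS sum,
       toBooleanList(['true', 'no', 'false']) AS booleans         // non-convertible values become null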

LOAD CSV functions

Cypher feature Description

file() Returns the absolute path of the file that LOAD CSV is using.

linenumber() Returns the line number that LOAD CSV is currently using.
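
Both functions are only meaningful inside a LOAD CSV clause. A minimal sketch follows; the file URL
and the title column are placeholders and assume a CSV file accessible to the server.

LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
RETURN file() AS sourceFile,        // absolute path of the CSV currently being read
       linenumber() AS lineNo,      // current line within that file
       row.title AS title
LIMIT 5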

Logarithmic functions

Cypher feature Description

e() Returns the base of the natural logarithm, e.

Numeric functions

Cypher feature Description

isNaN() Returns whether the given INTEGER or FLOAT is NaN.

rand() Returns a random FLOAT in the range from 0 (inclusive) to 1 (exclusive).

round() Returns the value of a number rounded to the nearest INTEGER.

sign() Returns the signum of an INTEGER or FLOAT: 0 if the number is 0, -1 for any
negative number, and 1 for any positive number.

Predicate functions

Cypher feature Description

all() Returns true if the predicate holds for all elements in the given LIST<ANY>.

any() Returns true if the predicate holds for at least one element in the given
LIST<ANY>.

isEmpty() Checks whether a STRING, MAP, or LIST<ANY> is empty.

none() Returns true if the predicate holds for no element in the given LIST<ANY>.

single() Returns true if the predicate holds for exactly one of the elements in the
given LIST<ANY>.
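
A minimal sketch over a literal list; no graph data is assumed.

WITH [2, 4, 7, 10] AS numbers
RETURN all(n IN numbers WHERE n > 0) AS allPositive,
       any(n IN numbers WHERE n % 2 <> 0) AS hasOdd,
       none(n IN numbers WHERE n > 100) AS noneAboveHundred,
       single(n IN numbers WHERE n = 7) AS exactlyOneSeven,
       isEmpty(numbers) AS listIsEmpty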

Scalar functions

Cypher feature Description

endNode() Returns the end NODE of a RELATIONSHIP.

head() Returns the first element in a LIST<ANY>.

last() Returns the last element in a LIST<ANY>.

properties() Returns a MAP containing all the properties of a NODE, RELATIONSHIP, or MAP.

randomUUID() Generates a random UUID.

startNode() Returns the start NODE of a RELATIONSHIP.

type() Returns a STRING representation of the RELATIONSHIP type.

valueType() Returns a STRING representation of the most precise value type that the
given expression evaluates to.

Spatial functions

Cypher feature Description

point() Returns a 2D or 3D point object, given two or respectively three


coordinate values in the Cartesian coordinate system or WGS 84
geographic coordinate system.

point.distance() Returns a FLOAT representing the distance between any two points in the
same CRS. If the points are in the WGS 84 CRS, the function returns the
geodesic distance (i.e., the shortest path along the curved surface of the
Earth). If the points are in a Cartesian CRS, the function returns the
Euclidean distance (i.e., the shortest straight-line distance in a flat, planar
space).

point.withinBBox() Returns true if the provided point is within the bounding box defined by
the two provided points.
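
A short sketch using literal Cartesian points; the coordinates are arbitrary.

WITH point({x: 0, y: 0}) AS lowerLeft,
     point({x: 10, y: 10}) AS upperRight,
     point({x: 3, y: 4}) AS p
RETURN point.distance(lowerLeft, p) AS distanceFromOrigin,        // Euclidean distance in a Cartesian CRS
       point.withinBBox(p, lowerLeft, upperRight) AS insideBoundingBox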

String functions

Cypher feature Description

replace() Returns a STRING in which all occurrences of a specified search STRING in


the given STRING have been replaced by another (specified) replacement
STRING.

reverse() Returns a STRING or LIST<ANY> in which the order of all characters or


elements in the given STRING or LIST<ANY> have been reversed.

split() Returns a LIST<STRING> resulting from the splitting of the given STRING
around matches of the given delimiter(s).

substring() Returns a substring of the given STRING, beginning with a 0-based index
start.
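
A quick sketch of these functions applied to a literal STRING value.

WITH 'The Matrix Reloaded' AS title
RETURN replace(title, 'Reloaded', 'Revolutions') AS replaced,
       split(title, ' ') AS words,
       substring(title, 4, 6) AS middle,    // 'Matrix' (0-based start index 4, length 6)
       reverse(title) AS reversed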

Trigonometric functions

Cypher feature Description

atan2() Returns the arctangent2 of a set of coordinates in radians.

haversin() Returns half the versine of a number.


pi() Returns the mathematical constant pi.

Temporal duration functions

Cypher feature Description

duration.inDays() Computes the DURATION between the from instant (inclusive) and the to
instant (exclusive) in days.

duration.inMonths() Computes the DURATION between the from instant (inclusive) and the to
instant (exclusive) in months.

duration.inSeconds() Computes the DURATION between the from instant (inclusive) and the to
instant (exclusive) in seconds.
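
A minimal sketch; the two dates are arbitrary literals.

WITH date('2024-01-01') AS startDate, date('2024-03-15') AS endDate
RETURN duration.inDays(startDate, endDate) AS days,
       duration.inMonths(startDate, endDate) AS months,
       duration.inSeconds(startDate, endDate) AS seconds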

Temporal instant functions

Cypher feature Description

date.realtime() Returns the current DATE instant using the realtime clock.

date.statement() Returns the current DATE instant using the statement clock.

date.transaction() Returns the current DATE instant using the transaction clock.

date.truncate() Truncates the given temporal value to a DATE instant using the specified unit.

datetime.fromEpoch() Creates a ZONED DATETIME given the seconds and nanoseconds since the
start of the epoch.

datetime.fromEpochMillis() Creates a ZONED DATETIME given the milliseconds since the start of the
epoch.

datetime.realtime() Returns the current ZONED DATETIME instant using the realtime clock.

datetime.statement() Returns the current ZONED DATETIME instant using the statement clock.

datetime.transaction() Returns the current ZONED DATETIME instant using the transaction clock.


datetime.truncate() Truncates the given temporal value to a ZONED DATETIME instant using the
specified unit.

localdatetime.realtime() Returns the current LOCAL DATETIME instant using the realtime clock.

localdatetime.statement() Returns the current LOCAL DATETIME instant using the statement clock.

localdatetime.transaction() Returns the current LOCAL DATETIME instant using the transaction clock.

localdatetime.truncate() Truncates the given temporal value to a LOCAL DATETIME instant using the
specified unit.

localtime.realtime() Returns the current LOCAL TIME instant using the realtime clock.

localtime.statement() Returns the current LOCAL TIME instant using the statement clock.

localtime.transaction() Returns the current LOCAL TIME instant using the transaction clock.

localtime.truncate() Truncates the given temporal value to a LOCAL TIME instant using the
specified unit.

time.realtime() Returns the current ZONED TIME instant using the realtime clock.

time.statement() Returns the current ZONED TIME instant using the statement clock.

time.transaction() Returns the current ZONED TIME instant using the transaction clock.

time.truncate() Truncates the given temporal value to a ZONED TIME instant using the
specified unit.
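
A short sketch contrasting the three clocks and truncation; the returned values depend on when and
where the query runs.

RETURN datetime.transaction() AS txClock,        // same value for the whole transaction
       datetime.statement() AS statementClock,   // same value for the whole statement
       datetime.realtime() AS realtimeClock,     // current wall-clock time
       datetime.truncate('day', datetime()) AS startOfToday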

Vector functions

Cypher feature Description

vector.similarity.cosine() Returns a FLOAT representing the similarity between the argument vectors
based on their cosine.


vector.similarity.euclidean() Returns a FLOAT representing the similarity between the argument vectors
based on their Euclidean distance.
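
A minimal sketch on two literal vectors; the numbers are arbitrary, and the availability of these
functions depends on the Neo4j version in use.

WITH [1.0, 0.0, 0.0] AS a, [0.0, 1.0, 0.0] AS b
RETURN vector.similarity.cosine(a, b) AS cosineSimilarity,
       vector.similarity.euclidean(a, b) AS euclideanSimilarity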

Indexes

Cypher feature Description

Range indexes Neo4j’s default index. Supports most types of predicates.

Text indexes Solves predicates operating on STRING values. Optimized for queries
filtering with the STRING operators CONTAINS and ENDS WITH.

Point indexes Solves predicates on spatial POINT values. Optimized for queries filtering
on distance or within bounding boxes.

Token lookup indexes Only solves node label and relationship type predicates (i.e. they cannot
solve any predicates filtering on properties).

Full text indexes Enables searching within the content of STRING properties and for similarity
comparisons between query strings and STRING values stored in the
database.

Vector indexes Enables similarity searches and complex analytical queries by representing
nodes or properties as vectors in a multidimensional space.

Index hints Cypher allows for index hints to influence the planner when creating
execution plans. Index hints are specified with the USING keyword.
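
As a sketch of how one of these indexes is created, the following creates a text index; the :Movie
label, the title property, and the index name are illustrative placeholders.

CREATE TEXT INDEX movie_title_text IF NOT EXISTS
FOR (m:Movie) ON (m.title)

A query filtering with CONTAINS or ENDS WITH on m.title can then use this index automatically, or be
nudged towards it with an index hint such as USING TEXT INDEX m:Movie(title).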

Constraints
GQL supports GRAPH TYPES as a way of constraining a graph schema, but does not support individual
constraints.

Cypher feature Description

Property uniqueness Ensures that the combined property values are unique for all nodes with a
constraints specific label or all relationships with a specific type.

Property existence Ensures that a property exists either for all nodes with a specific label or
constraints for all relationships with a specific type.


Property type constraints Ensures that a property has the required property type for all nodes with a
specific label or for all relationships with a specific type.

Key constraints Ensures that all properties exist and that the combined property values are
unique for all nodes with a specific label or all relationships with a specific
type.
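
A brief sketch of the Cypher 5 creation syntax for two of these constraint types. The :Book label,
its properties, and the constraint names are placeholders, and property existence constraints require
Neo4j Enterprise Edition. Each command is run as a separate statement.

// Property uniqueness constraint
CREATE CONSTRAINT book_isbn_unique IF NOT EXISTS
FOR (b:Book) REQUIRE b.isbn IS UNIQUE

// Property existence constraint
CREATE CONSTRAINT book_title_exists IF NOT EXISTS
FOR (b:Book) REQUIRE b.title IS NOT NULL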

Operators

Cypher feature Description

STARTS WITH, CONTAINS, ENDS WITH, and regular expressions STRING comparison operators.

IN IN predicate for LIST values.
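For example, combining both kinds of predicates against the movies data set used in the tutorials below (the values are illustrative):

MATCH (m:Movie)
WHERE m.title STARTS WITH 'The Matrix'
  AND m.released IN [1999, 2003]
RETURN m.title, m.released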

Query optimization

Cypher feature Description

EXPLAIN/PROFILE Optionally prepended to queries to produce execution plans. EXPLAIN will only generate an execution plan but not run the query; PROFILE will do both.

CYPHER runtime=parallel Cypher allows for setting the runtime of queries, determining how the
query will be executed. The available Cypher runtimes are: slotted,
pipelined, parallel.

CYPHER inferSchemaParts=off Cypher allows for setting numerous query options. For more information, see Query options.
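Query options are written in front of the query itself. A sketch combining a runtime option with PROFILE (the parallel runtime is an Enterprise Edition feature and only supports read queries, so this example is read-only):

CYPHER runtime=parallel
PROFILE
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title, count(p) AS actors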

Administration

The documentation for Cypher’s administration commands is located in Neo4j’s Operations Manual.

Cypher feature Description

Database management Commands to CREATE, SHOW, ALTER, and DROP standard and composite
databases.

Alias management Commands to CREATE, SHOW, ALTER, and DROP database aliases.

Server management Commands to administer servers in a cluster and the databases allocated
to them.

Authentication and authorization Commands to manage users, roles, and privileges.
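A couple of representative commands, each run as a separate statement and assuming suitable privileges (multi-database management requires Enterprise Edition; the database name is a placeholder):

CREATE DATABASE sales IF NOT EXISTS

SHOW DATABASES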

Tutorials and extended examples


• Basic query tuning example

• Advanced query tuning example

• Shortest path planning - information about how to plan queries using the shortestPath() function.

Basic query tuning example


This page describes how to profile a query by using optimizations based on native index capabilities.

The data set


In this section, examples demonstrate the impact native indexes can have on query performance under
certain conditions. You will use a movies dataset to illustrate this query tuning.

In this tutorial, you import data from the following CSV files:

• movies.csv

• actors.csv

• directors.csv

Movies

The movies.csv file contains three columns: title, released, and tagline.

The content of the movies.csv file:

movies.csv

title,released,tagline
Something's Gotta Give,1975,null
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Devil's Advocate,1997,Evil has its winning ways
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
The Matrix Revolutions,2003,Everything that has a beginning has an end

The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
V for Vendetta,2006,Freedom! Forever!
Cloud Atlas,2012,Everything is connected
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
Speed Racer,2008,Speed has no limits
Cloud Atlas,2012,Everything is connected
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
Ninja Assassin,2009,Prepare to enter a secret world of assassins
V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
Cloud Atlas,2012,Everything is connected
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
Ninja Assassin,2009,Prepare to enter a secret world of assassins
V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
The Matrix,1999,Welcome to the Real World
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
The Devil's Advocate,1997,Evil has its winning ways
The Devil's Advocate,1997,Evil has its winning ways
The Devil's Advocate,1997,Evil has its winning ways
Jerry Maguire,2000,The rest of his life begins now.
Top Gun,1986,"I feel the need, the need for speed."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Something's Gotta Give,1975,null
One Flew Over the Cuckoo's Nest,1975,"If he's crazy, what does that make you?"
Hoffa,1992,He didn't want law. He wanted justice.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Apollo 13,1995,"Houston, we have a problem."
Frost/Nixon,2008,400 million people were waiting for the truth.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
What Dreams May Come,1998,After life there is more. The end is just the beginning.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
Jerry Maguire,2000,The rest of his life begins now.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Hoffa,1992,He didn't want law. He wanted justice.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Ninja Assassin,2009,Prepare to enter a secret world of assassins
V for Vendetta,2006,Freedom! Forever!
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man

will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
When Harry Met Sally,1998,At odds in life... in love on-line.
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
When Harry Met Sally,1998,At odds in life... in love on-line.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
When Harry Met Sally,1998,At odds in life... in love on-line.
Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
You've Got Mail,1998,At odds in life... in love on-line.
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
The Green Mile,1999,Walk a mile you'll never forget.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Cast Away,2000,"At the edge of the world, his journey begins."
Twister,1996,Don't Breathe. Don't Look Back.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
You've Got Mail,1998,At odds in life... in love on-line.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
Snow Falling on Cedars,1999,First loves last. Forever.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Bicentennial Man,1999,One robot's 200 year journey to become an ordinary man.
The Birdcage,1996,Come as you are
What Dreams May Come,1998,After life there is more. The end is just the beginning.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
Snow Falling on Cedars,1999,First loves last. Forever.
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Snow Falling on Cedars,1999,First loves last. Forever.
The Green Mile,1999,Walk a mile you'll never forget.
Snow Falling on Cedars,1999,First loves last. Forever.
Snow Falling on Cedars,1999,First loves last. Forever.
You've Got Mail,1998,At odds in life... in love on-line.
You've Got Mail,1998,At odds in life... in love on-line.
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
You've Got Mail,1998,At odds in life... in love on-line.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
The Polar Express,2004,This Holiday Season… Believe
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
Cast Away,2000,"At the edge of the world, his journey begins."

Apollo 13,1995,"Houston, we have a problem."
The Green Mile,1999,Walk a mile you'll never forget.
The Da Vinci Code,2006,Break The Codes
Cloud Atlas,2012,Everything is connected
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
You've Got Mail,1998,At odds in life... in love on-line.
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
You've Got Mail,1998,At odds in life... in love on-line.
When Harry Met Sally,1998,At odds in life... in love on-line.
When Harry Met Sally,1998,At odds in life... in love on-line.
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
The Birdcage,1996,Come as you are
Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
When Harry Met Sally,1998,At odds in life... in love on-line.
When Harry Met Sally,1998,At odds in life... in love on-line.
When Harry Met Sally,1998,At odds in life... in love on-line.
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
Unforgiven,1992,"It's a hell of a thing, killing a man"
The Birdcage,1996,Come as you are
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Twister,1996,Don't Breathe. Don't Look Back.
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
The Birdcage,1996,Come as you are
Unforgiven,1992,"It's a hell of a thing, killing a man"
Unforgiven,1992,"It's a hell of a thing, killing a man"
Unforgiven,1992,"It's a hell of a thing, killing a man"
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Cloud Atlas,2012,Everything is connected
Cloud Atlas,2012,Everything is connected
Cloud Atlas,2012,Everything is connected
The Da Vinci Code,2006,Break The Codes
The Da Vinci Code,2006,Break The Codes
The Da Vinci Code,2006,Break The Codes
Apollo 13,1995,"Houston, we have a problem."
Frost/Nixon,2008,400 million people were waiting for the truth.
The Da Vinci Code,2006,Break The Codes
V for Vendetta,2006,Freedom! Forever!
V for Vendetta,2006,Freedom! Forever!
V for Vendetta,2006,Freedom! Forever!
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Speed Racer,2008,Speed has no limits
Ninja Assassin,2009,Prepare to enter a secret world of assassins
The Green Mile,1999,Walk a mile you'll never forget.

The Green Mile,1999,Walk a mile you'll never forget.
Frost/Nixon,2008,400 million people were waiting for the truth.
The Green Mile,1999,Walk a mile you'll never forget.
Apollo 13,1995,"Houston, we have a problem."
The Green Mile,1999,Walk a mile you'll never forget.
The Green Mile,1999,Walk a mile you'll never forget.
The Green Mile,1999,Walk a mile you'll never forget.
Frost/Nixon,2008,400 million people were waiting for the truth.
Frost/Nixon,2008,400 million people were waiting for the truth.
Bicentennial Man,1999,One robot's 200 year journey to become an ordinary man.
Frost/Nixon,2008,400 million people were waiting for the truth.
One Flew Over the Cuckoo's Nest,1975,"If he's crazy, what does that make you?"
Hoffa,1992,He didn't want law. He wanted justice.
Hoffa,1992,He didn't want law. He wanted justice.
Hoffa,1992,He didn't want law. He wanted justice.
Apollo 13,1995,"Houston, we have a problem."
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
Twister,1996,Don't Breathe. Don't Look Back.
Apollo 13,1995,"Houston, we have a problem."
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
Twister,1996,Don't Breathe. Don't Look Back.
Twister,1996,Don't Breathe. Don't Look Back.
The Polar Express,2004,This Holiday Season… Believe
Cast Away,2000,"At the edge of the world, his journey begins."
One Flew Over the Cuckoo's Nest,1975,"If he's crazy, what does that make you?"
Something's Gotta Give,1975,null
Something's Gotta Give,1975,null
Something's Gotta Give,1975,null
Something's Gotta Give,1975,null
Bicentennial Man,1999,One robot's 200 year journey to become an ordinary man.
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Da Vinci Code,2006,Break The Codes
The Birdcage,1996,Come as you are
Unforgiven,1992,"It's a hell of a thing, killing a man"
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
Cloud Atlas,2012,Everything is connected
The Da Vinci Code,2006,Break The Codes
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"

Actors

The actors.csv file contains four columns: title, roles, name, and born.

The content of the actors.csv file:

actors.csv

title,roles,name,born
Something's Gotta Give,Julian Mercer,Keanu Reeves,1964
Johnny Mnemonic,Johnny Mnemonic,Keanu Reeves,1964
The Replacements,Shane Falco,Keanu Reeves,1964
The Devil's Advocate,Kevin Lomax,Keanu Reeves,1964
The Matrix Revolutions,Neo,Keanu Reeves,1964
The Matrix Reloaded,Neo,Keanu Reeves,1964
The Matrix,Neo,Keanu Reeves,1964
The Matrix Revolutions,Trinity,Carrie-Anne Moss,1967
The Matrix Reloaded,Trinity,Carrie-Anne Moss,1967
The Matrix,Trinity,Carrie-Anne Moss,1967
The Matrix Revolutions,Morpheus,Laurence Fishburne,1961
The Matrix Reloaded,Morpheus,Laurence Fishburne,1961
The Matrix,Morpheus,Laurence Fishburne,1961
V for Vendetta,V,Hugo Weaving,1960
Cloud Atlas,Bill Smoke;Haskell Moore;Tadeusz Kesselring;Nurse Noakes;Boardman Mephi;Old Georgie,Hugo

Weaving,1960
The Matrix Revolutions,Agent Smith,Hugo Weaving,1960
The Matrix Reloaded,Agent Smith,Hugo Weaving,1960
The Matrix,Agent Smith,Hugo Weaving,1960
The Matrix,Emil,Emil Eifrem,1978
That Thing You Do,Tina,Charlize Theron,1975
The Devil's Advocate,Mary Ann Lomax,Charlize Theron,1975
The Devil's Advocate,John Milton,Al Pacino,1940
Jerry Maguire,Jerry Maguire,Tom Cruise,1962
Top Gun,Maverick,Tom Cruise,1962
A Few Good Men,Lt. Daniel Kaffee,Tom Cruise,1962
Something's Gotta Give,Harry Sanborn,Jack Nicholson,1937
One Flew Over the Cuckoo's Nest,Randle McMurphy,Jack Nicholson,1937
Hoffa,Hoffa,Jack Nicholson,1937
As Good as It Gets,Melvin Udall,Jack Nicholson,1937
A Few Good Men,Col. Nathan R. Jessup,Jack Nicholson,1937
A Few Good Men,Lt. Cdr. JoAnne Galloway,Demi Moore,1962
Apollo 13,Jack Swigert,Kevin Bacon,1958
Frost/Nixon,Jack Brennan,Kevin Bacon,1958
A Few Good Men,Capt. Jack Ross,Kevin Bacon,1958
Stand By Me,Ace Merrill,Kiefer Sutherland,1966
A Few Good Men,Lt. Jonathan Kendrick,Kiefer Sutherland,1966
A Few Good Men,Cpl. Jeffrey Barnes,Noah Wyle,1971
What Dreams May Come,Albert Lewis,Cuba Gooding Jr.,1968
As Good as It Gets,Frank Sachs,Cuba Gooding Jr.,1968
Jerry Maguire,Rod Tidwell,Cuba Gooding Jr.,1968
A Few Good Men,Cpl. Carl Hammaker,Cuba Gooding Jr.,1968
A Few Good Men,Lt. Sam Weinberg,Kevin Pollak,1957
Hoffa,Frank Fitzsimmons,J.T. Walsh,1943
A Few Good Men,Lt. Col. Matthew Andrew Markinson,J.T. Walsh,1943
A Few Good Men,Pfc. Louden Downey,James Marshall,1967
A Few Good Men,Dr. Stone,Christopher Guest,1948
A Few Good Men,Man in Bar,Aaron Sorkin,1961
Top Gun,Charlie,Kelly McGillis,1957
Top Gun,Iceman,Val Kilmer,1959
Top Gun,Goose,Anthony Edwards,1962
Top Gun,Viper,Tom Skerritt,1933
When Harry Met Sally,Sally Albright,Meg Ryan,1961
Joe Versus the Volcano,DeDe;Angelica Graynamore;Patricia Graynamore,Meg Ryan,1961
Sleepless in Seattle,Annie Reed,Meg Ryan,1961
You've Got Mail,Kathleen Kelly,Meg Ryan,1961
Top Gun,Carole,Meg Ryan,1961
Jerry Maguire,Dorothy Boyd,Renee Zellweger,1969
Jerry Maguire,Avery Bishop,Kelly Preston,1962
Stand By Me,Vern Tessio,Jerry O'Connell,1974
Jerry Maguire,Frank Cushman,Jerry O'Connell,1974
Jerry Maguire,Bob Sugar,Jay Mohr,1970
The Green Mile,Jan Edgecomb,Bonnie Hunt,1961
Jerry Maguire,Laurel Boyd,Bonnie Hunt,1961
Jerry Maguire,Marcee Tidwell,Regina King,1971
Jerry Maguire,Ray Boyd,Jonathan Lipnicki,1990
Stand By Me,Chris Chambers,River Phoenix,1970
Stand By Me,Teddy Duchamp,Corey Feldman,1971
Stand By Me,Gordie Lachance,Wil Wheaton,1972
Stand By Me,Denny Lachance,John Cusack,1966
RescueDawn,Admiral,Marshall Bell,1942
Stand By Me,Mr. Lachance,Marshall Bell,1942
Cast Away,Kelly Frears,Helen Hunt,1963
Twister,Dr. Jo Harding,Helen Hunt,1963
As Good as It Gets,Carol Connelly,Helen Hunt,1963
You've Got Mail,Frank Navasky,Greg Kinnear,1963
As Good as It Gets,Simon Bishop,Greg Kinnear,1963
What Dreams May Come,Simon Bishop,Annabella Sciorra,1960
Snow Falling on Cedars,Nels Gudmundsson,Max von Sydow,1929
What Dreams May Come,The Tracker,Max von Sydow,1929
What Dreams May Come,The Face,Werner Herzog,1942
Bicentennial Man,Andrew Marin,Robin Williams,1951
The Birdcage,Armand Goldman,Robin Williams,1951
What Dreams May Come,Chris Nielsen,Robin Williams,1951
Snow Falling on Cedars,Ishmael Chambers,Ethan Hawke,1970
Ninja Assassin,Takeshi,Rick Yune,1971
Snow Falling on Cedars,Kazuo Miyamoto,Rick Yune,1971
The Green Mile,Warden Hal Moores,James Cromwell,1940
Snow Falling on Cedars,Judge Fielding,James Cromwell,1940
You've Got Mail,Patricia Eden,Parker Posey,1968
You've Got Mail,Kevin Jackson,Dave Chappelle,1973
RescueDawn,Duane,Steve Zahn,1967

You've Got Mail,George Pappas,Steve Zahn,1967
A League of Their Own,Jimmy Dugan,Tom Hanks,1956
The Polar Express,Hero Boy;Father;Conductor;Hobo;Scrooge;Santa Claus,Tom Hanks,1956
Charlie Wilson's War,Rep. Charlie Wilson,Tom Hanks,1956
Cast Away,Chuck Noland,Tom Hanks,1956
Apollo 13,Jim Lovell,Tom Hanks,1956
The Green Mile,Paul Edgecomb,Tom Hanks,1956
The Da Vinci Code,Dr. Robert Langdon,Tom Hanks,1956
Cloud Atlas,Zachry;Dr. Henry Goose;Isaac Sachs;Dermot Hoggins,Tom Hanks,1956
That Thing You Do,Mr. White,Tom Hanks,1956
Joe Versus the Volcano,Joe Banks,Tom Hanks,1956
Sleepless in Seattle,Sam Baldwin,Tom Hanks,1956
You've Got Mail,Joe Fox,Tom Hanks,1956
Sleepless in Seattle,Suzy,Rita Wilson,1956
Sleepless in Seattle,Walter,Bill Pullman,1953
Sleepless in Seattle,Greg,Victor Garber,1949
A League of Their Own,Doris Murphy,Rosie O'Donnell,1962
Sleepless in Seattle,Becky,Rosie O'Donnell,1962
The Birdcage,Albert Goldman,Nathan Lane,1956
Joe Versus the Volcano,Baw,Nathan Lane,1956
When Harry Met Sally,Harry Burns,Billy Crystal,1948
When Harry Met Sally,Marie,Carrie Fisher,1956
When Harry Met Sally,Jess,Bruno Kirby,1949
That Thing You Do,Faye Dolan,Liv Tyler,1977
The Replacements,Annabelle Farrell,Brooke Langton,1970
Unforgiven,Little Bill Daggett,Gene Hackman,1930
The Birdcage,Sen. Kevin Keeley,Gene Hackman,1930
The Replacements,Jimmy McGinty,Gene Hackman,1930
The Replacements,Clifford Franklin,Orlando Jones,1968
RescueDawn,Dieter Dengler,Christian Bale,1974
Twister,Eddie,Zach Grenier,1954
RescueDawn,Squad Leader,Zach Grenier,1954
Unforgiven,English Bob,Richard Harris,1930
Unforgiven,Bill Munny,Clint Eastwood,1930
Johnny Mnemonic,Takahashi,Takeshi Kitano,1947
Johnny Mnemonic,Jane,Dina Meyer,1968
Johnny Mnemonic,J-Bone,Ice-T,1958
Cloud Atlas,Luisa Rey;Jocasta Ayrs;Ovid;Meronym,Halle Berry,1966
Cloud Atlas,Vyvyan Ayrs;Captain Molyneux;Timothy Cavendish,Jim Broadbent,1949
The Da Vinci Code,Sir Leight Teabing,Ian McKellen,1939
The Da Vinci Code,Sophie Neveu,Audrey Tautou,1976
The Da Vinci Code,Silas,Paul Bettany,1971
V for Vendetta,Evey Hammond,Natalie Portman,1981
V for Vendetta,Eric Finch,Stephen Rea,1946
V for Vendetta,High Chancellor Adam Sutler,John Hurt,1940
Ninja Assassin,Ryan Maslow,Ben Miles,1967
Speed Racer,Cass Jones,Ben Miles,1967
V for Vendetta,Dascomb,Ben Miles,1967
Speed Racer,Speed Racer,Emile Hirsch,1985
Speed Racer,Pops,John Goodman,1960
Speed Racer,Mom,Susan Sarandon,1946
Speed Racer,Racer X,Matthew Fox,1966
Speed Racer,Trixie,Christina Ricci,1980
Ninja Assassin,Raizo,Rain,1982
Speed Racer,Taejo Togokahn,Rain,1982
Ninja Assassin,Mika Coretti,Naomie Harris,null
The Green Mile,John Coffey,Michael Clarke Duncan,1957
The Green Mile,Brutus 'Brutal' Howell,David Morse,1953
Frost/Nixon,"James Reston, Jr.",Sam Rockwell,1968
The Green Mile,'Wild Bill' Wharton,Sam Rockwell,1968
Apollo 13,Ken Mattingly,Gary Sinise,1955
The Green Mile,Burt Hammersmith,Gary Sinise,1955
The Green Mile,Melinda Moores,Patricia Clarkson,1959
Frost/Nixon,Richard Nixon,Frank Langella,1938
Frost/Nixon,David Frost,Michael Sheen,1969
Bicentennial Man,Rupert Burns,Oliver Platt,1960
Frost/Nixon,Bob Zelnick,Oliver Platt,1960
One Flew Over the Cuckoo's Nest,Martini,Danny DeVito,1944
Hoffa,Robert 'Bobby' Ciaro,Danny DeVito,1944
Hoffa,Peter 'Pete' Connelly,John C. Reilly,1965
Apollo 13,Gene Kranz,Ed Harris,1950
A League of Their Own,Bob Hinson,Bill Paxton,1955
Twister,Bill Harding,Bill Paxton,1955
Apollo 13,Fred Haise,Bill Paxton,1955
Charlie Wilson's War,Gust Avrakotos,Philip Seymour Hoffman,1967
Twister,Dustin 'Dusty' Davis,Philip Seymour Hoffman,1967
Something's Gotta Give,Erica Barry,Diane Keaton,1946

Charlie Wilson's War,Joanne Herring,Julia Roberts,1967
A League of Their Own,'All the Way' Mae Mordabito,Madonna,1954
A League of Their Own,Dottie Hinson,Geena Davis,1956
A League of Their Own,Kit Keller,Lori Petty,1963

Directors

The directors.csv file contains three columns: title, name, and born.

The content of the directors.csv file:

directors.csv

title,name,born
Speed Racer,Andy Wachowski,1967
Cloud Atlas,Andy Wachowski,1967
The Matrix Revolutions,Andy Wachowski,1967
The Matrix Reloaded,Andy Wachowski,1967
The Matrix,Andy Wachowski,1967
Speed Racer,Lana Wachowski,1965
Cloud Atlas,Lana Wachowski,1965
The Matrix Revolutions,Lana Wachowski,1965
The Matrix Reloaded,Lana Wachowski,1965
The Matrix,Lana Wachowski,1965
The Devil's Advocate,Taylor Hackford,1944
Ninja Assassin,James Marshall,1967
V for Vendetta,James Marshall,1967
When Harry Met Sally,Rob Reiner,1947
Stand By Me,Rob Reiner,1947
A Few Good Men,Rob Reiner,1947
Top Gun,Tony Scott,1944
Jerry Maguire,Cameron Crowe,1957
As Good as It Gets,James L. Brooks,1940
RescueDawn,Werner Herzog,1942
What Dreams May Come,Vincent Ward,1956
Snow Falling on Cedars,Scott Hicks,1953
That Thing You Do,Tom Hanks,1956
Sleepless in Seattle,Nora Ephron,1941
You've Got Mail,Nora Ephron,1941
Joe Versus the Volcano,John Patrick Stanley,1950
The Replacements,Howard Deutch,1950
Charlie Wilson's War,Mike Nichols,1931
The Birdcage,Mike Nichols,1931
Unforgiven,Clint Eastwood,1930
Johnny Mnemonic,Robert Longo,1953
Cloud Atlas,Tom Tykwer,1965
Apollo 13,Ron Howard,1954
Frost/Nixon,Ron Howard,1954
The Da Vinci Code,Ron Howard,1954
The Green Mile,Frank Darabont,1959
Hoffa,Danny DeVito,1944
Twister,Jan de Bont,1943
The Polar Express,Robert Zemeckis,1951
Cast Away,Robert Zemeckis,1951
One Flew Over the Cuckoo's Nest,Milos Forman,1932
Something's Gotta Give,Nancy Meyers,1949
Bicentennial Man,Chris Columbus,1958
A League of Their Own,Penny Marshall,1943

Prerequisites
The example uses the Linux or macOS tarball installation. It assumes that your current working directory is
the <neo4j-home> directory of the tarball installation, and that the CSV files are placed in the default import
directory.

• For the default import directory of other installations, see Operations Manual → File locations.

• The import location can be configured with the setting described in Operations Manual → server.directories.import.
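If you are unsure where your installation resolves file:/// URLs, you can check the effective value of that setting directly from Cypher (assuming your user is allowed to view settings):

SHOW SETTINGS 'server.directories.import' YIELD name, value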

Importing the data


Import the movies.csv file

LOAD CSV WITH HEADERS FROM 'file:///movies.csv' AS line
MERGE (m:Movie {title: line.title})
ON CREATE SET
  m.released = toInteger(line.released),
  m.tagline = line.tagline

Result

Added 38 nodes, Set 114 properties, Added 38 labels

Import the actors.csv file

LOAD CSV WITH HEADERS FROM 'file:///actors.csv' AS line
MATCH (m:Movie {title: line.title})
MERGE (p:Person {name: line.name})
ON CREATE SET p.born = toInteger(line.born)
MERGE (p)-[:ACTED_IN {roles: split(line.roles, ';')}]->(m)

Result

Added 102 nodes, Created 172 relationships, Set 375 properties, Added 102 labels

Import the directors.csv file

LOAD CSV WITH HEADERS FROM 'file:///directors.csv' AS line
MATCH (m:Movie {title: line.title})
MERGE (p:Person {name: line.name})
ON CREATE SET p.born = toInteger(line.born)
MERGE (p)-[:DIRECTED]->(m)

Result

Added 23 nodes, Created 44 relationships, Set 46 properties, Added 23 labels
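Before profiling anything, it can be useful to sanity-check the import with a quick count per label (this step is not part of the original import, just a verification; it should report 38 Movie nodes and 125 Person nodes):

MATCH (n)
RETURN labels(n) AS labels, count(*) AS nodes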

Profile query
Let’s say you want to write a query to find 'Tom Hanks'.

The naive way of doing this would be to write the following:

MATCH (p {name: 'Tom Hanks'})
RETURN p

Result

p

(:Person {name: "Tom Hanks", born: 1956})

Rows: 1

This query will find the 'Tom Hanks' node, but as the number of nodes in the database increases it will
become slower and slower. You can profile the query to find out why that is.

You can learn more about the options for profiling queries in Query tuning, but in this case you are going to
prefix the query with PROFILE:

PROFILE
MATCH (p {name: 'Tom Hanks'})
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page
Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | p | 8 | 1 | 3 | |
| | |
| | +----+------------------------+----------------+------+---------+----------------+
| | |
| +Filter | 1 | p.name = $autostring_0 | 8 | 1 | 125 | |
| | |
| | +----+------------------------+----------------+------+---------+----------------+
| | |
| +AllNodesScan | 2 | p | 163 | 163 | 164 | 120 |
4/0 | 0.860 | Fused in Pipeline 0 |
+-----------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 292, total allocated memory: 184

1 row
ready to start consuming query after 17 ms, results consumed after another 10 ms

The first thing to keep in mind when reading execution plans is that you need to read from the bottom up.

In that vein, starting from the last row, the first thing you notice is that the value in the Rows column seems
high given there is only one node with the name property 'Tom Hanks' in the database. If you look across
to the Operator column, you will see that AllNodesScan has been used which means that the query
planner scanned through all the nodes in the database.

The Filter operator then checks the name property on each of the nodes passed through by
AllNodesScan.

This seems like an inefficient way of finding 'Tom Hanks' given that you are looking at many nodes that
are not even people and therefore are not what you are looking for.

The solution to this problem is that whenever you are looking for a node you should specify a label to help
the query planner narrow down the search space.

For this query you need to add a Person label.

MATCH (p:Person {name: 'Tom Hanks'})
RETURN p

This query will be faster than the first one, but as the number of people in your database increases, you may
notice that the query slows down again.

Again you can profile the query to work out why:

PROFILE
MATCH (p:Person {name: 'Tom Hanks'})
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+------------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page
Cache Hits/Misses | Time (ms) | Pipeline |
+------------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+
| +ProduceResults | 0 | p | 6 | 1 | 3 | |
| | |
| | +----+------------------------+----------------+------+---------+----------------+
| | |
| +Filter | 1 | p.name = $autostring_0 | 6 | 1 | 250 | |
| | |
| | +----+------------------------+----------------+------+---------+----------------+
| | |
| +NodeByLabelScan | 2 | p:Person | 125 | 125 | 126 | 120 |
3/0 | 0.772 | Fused in Pipeline 0 |
+------------------+----+------------------------+----------------+------+---------+----------------
+------------------------+-----------+---------------------+

Total database accesses: 379, total allocated memory: 184

1 row

This time the Rows value on the last row has been reduced, so you are no longer scanning all the nodes that you were
before, which is a good start. The NodeByLabelScan operator indicates that you achieved this by first doing
a linear scan of all the Person nodes in the database.

Once that is done, the Filter operator again scans through all those nodes,
comparing the name property of each one.

This might be acceptable in some cases, but if you are going to be looking up people by name frequently,
you will see better performance if you create an index on the name property for the Person label:

CREATE INDEX FOR (p:Person)
ON (p.name)

Result

Added 1 indexes

Index population runs in the background; to make sure the index is online before running the next query, you can wait for it:

CALL db.awaitIndexes

Now if you run the query again it will run more quickly:

MATCH (p:Person {name: 'Tom Hanks'})
RETURN p

Profile the query again to see why that is:

PROFILE
MATCH (p:Person {name: 'Tom Hanks'})
RETURN p

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

+-----------------+----+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| Operator | Id | Details | Estimated Rows | Rows |
DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline |
+-----------------+----+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults | 0 | p | 1 | 1 |
3 | | | | |
| | +----+-------------------------------------------------------+----------------+------
+---------+----------------+ | | |
| +NodeIndexSeek | 1 | RANGE INDEX p:Person(name) WHERE name = $autostring_0 | 1 | 1 |
2 | 120 | 2/1 | 0.688 | Fused in Pipeline 0 |
+-----------------+----+-------------------------------------------------------+----------------+------
+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 5, total allocated memory: 184

1 row

The execution plan is now down to a single row and uses the NodeIndexSeek operator, which does an index
seek (see Create, show, and delete indexes) to find the appropriate node.

Advanced query tuning example


This page describes advanced query optimizations based on native index capabilities.

One of the most important and useful ways of optimizing Cypher queries involves creating appropriate
indexes. This is described in more detail in Create, show, and delete indexes, and demonstrated in Basic
query tuning example. In summary, an index will be based on the combination of a Label and a property.
Any Cypher query that searches for nodes with a specific label and some predicate on the property
(equality, range or existence) will be planned to use the index if the cost planner deems that to be the most
efficient solution.

In order to benefit from enhancements provided by native indexes, it is useful to understand when index-
backed property lookup and index-backed ORDER BY will come into play. Let’s explain how to use these
features with a more advanced query tuning example.
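As a preview of what index-backed ORDER BY means in practice, consider the following sketch (the predicate and property are illustrative): with a range index on :Person(name), the planner can read name directly from the index and return rows already in index order, so no separate Sort operator is needed.

MATCH (p:Person)
WHERE p.name STARTS WITH 'T'
RETURN p.name AS name
ORDER BY p.name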

If you are upgrading an existing store, it may be necessary to drop and re-create existing indexes. For
information on native index support and upgrade considerations regarding indexes, see Operations Manual →
Performance → Index configuration.

The data set


In this section, examples demonstrate the impact native indexes can have on query performance under
certain conditions. You will use a movies dataset to illustrate this more advanced query tuning.

In this tutorial, you import data from the following CSV files:

• movies.csv

• actors.csv

• directors.csv

Movies

The movies.csv file contains three columns: title, released, and tagline.

The content of the movies.csv file:

movies.csv

title,released,tagline
Something's Gotta Give,1975,null
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Devil's Advocate,1997,Evil has its winning ways
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
V for Vendetta,2006,Freedom! Forever!
Cloud Atlas,2012,Everything is connected
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
Speed Racer,2008,Speed has no limits
Cloud Atlas,2012,Everything is connected
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
Ninja Assassin,2009,Prepare to enter a secret world of assassins

V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
Cloud Atlas,2012,Everything is connected
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
Ninja Assassin,2009,Prepare to enter a secret world of assassins
V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
The Matrix Revolutions,2003,Everything that has a beginning has an end
The Matrix Reloaded,2003,Free your mind
The Matrix,1999,Welcome to the Real World
The Matrix,1999,Welcome to the Real World
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
The Devil's Advocate,1997,Evil has its winning ways
The Devil's Advocate,1997,Evil has its winning ways
The Devil's Advocate,1997,Evil has its winning ways
Jerry Maguire,2000,The rest of his life begins now.
Top Gun,1986,"I feel the need, the need for speed."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Something's Gotta Give,1975,null
One Flew Over the Cuckoo's Nest,1975,"If he's crazy, what does that make you?"
Hoffa,1992,He didn't want law. He wanted justice.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Apollo 13,1995,"Houston, we have a problem."
Frost/Nixon,2008,400 million people were waiting for the truth.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
What Dreams May Come,1998,After life there is more. The end is just the beginning.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
Jerry Maguire,2000,The rest of his life begins now.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Hoffa,1992,He didn't want law. He wanted justice.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Ninja Assassin,2009,Prepare to enter a secret world of assassins
V for Vendetta,2006,Freedom! Forever!
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
When Harry Met Sally,1998,At odds in life... in love on-line.
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
When Harry Met Sally,1998,At odds in life... in love on-line.
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
A Few Good Men,1992,"In the heart of the nation's capital, in a courthouse of the U.S. government, one man
will stop at nothing to keep his honor, and one will stop at nothing to find the truth."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
When Harry Met Sally,1998,At odds in life... in love on-line.

Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
You've Got Mail,1998,At odds in life... in love on-line.
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Top Gun,1986,"I feel the need, the need for speed."
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
The Green Mile,1999,Walk a mile you'll never forget.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Jerry Maguire,2000,The rest of his life begins now.
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Stand By Me,1995,"For some, it's the last real taste of innocence, and the first real taste of life. But
for everyone, it's the time that memories are made of."
Cast Away,2000,"At the edge of the world, his journey begins."
Twister,1996,Don't Breathe. Don't Look Back.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
You've Got Mail,1998,At odds in life... in love on-line.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
As Good as It Gets,1997,A comedy from the heart that goes for the throat.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
Snow Falling on Cedars,1999,First loves last. Forever.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Bicentennial Man,1999,One robot's 200 year journey to become an ordinary man.
The Birdcage,1996,Come as you are
What Dreams May Come,1998,After life there is more. The end is just the beginning.
What Dreams May Come,1998,After life there is more. The end is just the beginning.
Snow Falling on Cedars,1999,First loves last. Forever.
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Snow Falling on Cedars,1999,First loves last. Forever.
The Green Mile,1999,Walk a mile you'll never forget.
Snow Falling on Cedars,1999,First loves last. Forever.
Snow Falling on Cedars,1999,First loves last. Forever.
You've Got Mail,1998,At odds in life... in love on-line.
You've Got Mail,1998,At odds in life... in love on-line.
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
You've Got Mail,1998,At odds in life... in love on-line.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
The Polar Express,2004,This Holiday Season… Believe
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
Cast Away,2000,"At the edge of the world, his journey begins."
Apollo 13,1995,"Houston, we have a problem."
The Green Mile,1999,Walk a mile you'll never forget.
The Da Vinci Code,2006,Break The Codes
Cloud Atlas,2012,Everything is connected
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
You've Got Mail,1998,At odds in life... in love on-line.
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
You've Got Mail,1998,At odds in life... in love on-line.
When Harry Met Sally,1998,At odds in life... in love on-line.

When Harry Met Sally,1998,At odds in life... in love on-line.
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
Sleepless in Seattle,1993,"What if someone you never met, someone you never saw, someone you never knew
was the only someone for you?"
Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
The Birdcage,1996,Come as you are
Joe Versus the Volcano,1990,"A story of love, lava and burning desire."
When Harry Met Sally,1998,At odds in life... in love on-line.
When Harry Met Sally,1998,At odds in life... in love on-line.
When Harry Met Sally,1998,At odds in life... in love on-line.
That Thing You Do,1996,In every life there comes a time when that thing you dream becomes that thing you
do
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
Unforgiven,1992,"It's a hell of a thing, killing a man"
The Birdcage,1996,Come as you are
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Twister,1996,Don't Breathe. Don't Look Back.
RescueDawn,2006,Based on the extraordinary true story of one man's fight for freedom
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
The Birdcage,1996,Come as you are
Unforgiven,1992,"It's a hell of a thing, killing a man"
Unforgiven,1992,"It's a hell of a thing, killing a man"
Unforgiven,1992,"It's a hell of a thing, killing a man"
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Johnny Mnemonic,1995,The hottest data on earth. In the coolest head in town
Cloud Atlas,2012,Everything is connected
Cloud Atlas,2012,Everything is connected
Cloud Atlas,2012,Everything is connected
The Da Vinci Code,2006,Break The Codes
The Da Vinci Code,2006,Break The Codes
The Da Vinci Code,2006,Break The Codes
Apollo 13,1995,"Houston, we have a problem."
Frost/Nixon,2008,400 million people were waiting for the truth.
The Da Vinci Code,2006,Break The Codes
V for Vendetta,2006,Freedom! Forever!
V for Vendetta,2006,Freedom! Forever!
V for Vendetta,2006,Freedom! Forever!
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Speed Racer,2008,Speed has no limits
V for Vendetta,2006,Freedom! Forever!
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Speed Racer,2008,Speed has no limits
Ninja Assassin,2009,Prepare to enter a secret world of assassins
Speed Racer,2008,Speed has no limits
Ninja Assassin,2009,Prepare to enter a secret world of assassins
The Green Mile,1999,Walk a mile you'll never forget.
The Green Mile,1999,Walk a mile you'll never forget.
Frost/Nixon,2008,400 million people were waiting for the truth.
The Green Mile,1999,Walk a mile you'll never forget.
Apollo 13,1995,"Houston, we have a problem."
The Green Mile,1999,Walk a mile you'll never forget.
The Green Mile,1999,Walk a mile you'll never forget.
The Green Mile,1999,Walk a mile you'll never forget.
Frost/Nixon,2008,400 million people were waiting for the truth.
Frost/Nixon,2008,400 million people were waiting for the truth.
Bicentennial Man,1999,One robot's 200 year journey to become an ordinary man.
Frost/Nixon,2008,400 million people were waiting for the truth.
One Flew Over the Cuckoo's Nest,1975,"If he's crazy, what does that make you?"
Hoffa,1992,He didn't want law. He wanted justice.
Hoffa,1992,He didn't want law. He wanted justice.
Hoffa,1992,He didn't want law. He wanted justice.
Apollo 13,1995,"Houston, we have a problem."

A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
Twister,1996,Don't Breathe. Don't Look Back.
Apollo 13,1995,"Houston, we have a problem."
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
Twister,1996,Don't Breathe. Don't Look Back.
Twister,1996,Don't Breathe. Don't Look Back.
The Polar Express,2004,This Holiday Season… Believe
Cast Away,2000,"At the edge of the world, his journey begins."
One Flew Over the Cuckoo's Nest,1975,"If he's crazy, what does that make you?"
Something's Gotta Give,1975,null
Something's Gotta Give,1975,null
Something's Gotta Give,1975,null
Something's Gotta Give,1975,null
Bicentennial Man,1999,One robot's 200 year journey to become an ordinary man.
Charlie Wilson's War,2007,A stiff drink. A little mascara. A lot of nerve. Who said they couldn't bring
down the Soviet empire.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
A League of Their Own,1992,Once in a lifetime you get a chance to do something different.
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
The Da Vinci Code,2006,Break The Codes
The Birdcage,1996,Come as you are
Unforgiven,1992,"It's a hell of a thing, killing a man"
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"
Cloud Atlas,2012,Everything is connected
The Da Vinci Code,2006,Break The Codes
The Replacements,2000,"Pain heals, Chicks dig scars... Glory lasts forever"

Actors

The actors.csv file contains four columns: title, roles, name, and born.

The content of the actors.csv file:

actors.csv

title,roles,name,born
Something's Gotta Give,Julian Mercer,Keanu Reeves,1964
Johnny Mnemonic,Johnny Mnemonic,Keanu Reeves,1964
The Replacements,Shane Falco,Keanu Reeves,1964
The Devil's Advocate,Kevin Lomax,Keanu Reeves,1964
The Matrix Revolutions,Neo,Keanu Reeves,1964
The Matrix Reloaded,Neo,Keanu Reeves,1964
The Matrix,Neo,Keanu Reeves,1964
The Matrix Revolutions,Trinity,Carrie-Anne Moss,1967
The Matrix Reloaded,Trinity,Carrie-Anne Moss,1967
The Matrix,Trinity,Carrie-Anne Moss,1967
The Matrix Revolutions,Morpheus,Laurence Fishburne,1961
The Matrix Reloaded,Morpheus,Laurence Fishburne,1961
The Matrix,Morpheus,Laurence Fishburne,1961
V for Vendetta,V,Hugo Weaving,1960
Cloud Atlas,Bill Smoke;Haskell Moore;Tadeusz Kesselring;Nurse Noakes;Boardman Mephi;Old Georgie,Hugo Weaving,1960
The Matrix Revolutions,Agent Smith,Hugo Weaving,1960
The Matrix Reloaded,Agent Smith,Hugo Weaving,1960
The Matrix,Agent Smith,Hugo Weaving,1960
The Matrix,Emil,Emil Eifrem,1978
That Thing You Do,Tina,Charlize Theron,1975
The Devil's Advocate,Mary Ann Lomax,Charlize Theron,1975
The Devil's Advocate,John Milton,Al Pacino,1940
Jerry Maguire,Jerry Maguire,Tom Cruise,1962
Top Gun,Maverick,Tom Cruise,1962
A Few Good Men,Lt. Daniel Kaffee,Tom Cruise,1962
Something's Gotta Give,Harry Sanborn,Jack Nicholson,1937
One Flew Over the Cuckoo's Nest,Randle McMurphy,Jack Nicholson,1937
Hoffa,Hoffa,Jack Nicholson,1937
As Good as It Gets,Melvin Udall,Jack Nicholson,1937
A Few Good Men,Col. Nathan R. Jessup,Jack Nicholson,1937

A Few Good Men,Lt. Cdr. JoAnne Galloway,Demi Moore,1962
Apollo 13,Jack Swigert,Kevin Bacon,1958
Frost/Nixon,Jack Brennan,Kevin Bacon,1958
A Few Good Men,Capt. Jack Ross,Kevin Bacon,1958
Stand By Me,Ace Merrill,Kiefer Sutherland,1966
A Few Good Men,Lt. Jonathan Kendrick,Kiefer Sutherland,1966
A Few Good Men,Cpl. Jeffrey Barnes,Noah Wyle,1971
What Dreams May Come,Albert Lewis,Cuba Gooding Jr.,1968
As Good as It Gets,Frank Sachs,Cuba Gooding Jr.,1968
Jerry Maguire,Rod Tidwell,Cuba Gooding Jr.,1968
A Few Good Men,Cpl. Carl Hammaker,Cuba Gooding Jr.,1968
A Few Good Men,Lt. Sam Weinberg,Kevin Pollak,1957
Hoffa,Frank Fitzsimmons,J.T. Walsh,1943
A Few Good Men,Lt. Col. Matthew Andrew Markinson,J.T. Walsh,1943
A Few Good Men,Pfc. Louden Downey,James Marshall,1967
A Few Good Men,Dr. Stone,Christopher Guest,1948
A Few Good Men,Man in Bar,Aaron Sorkin,1961
Top Gun,Charlie,Kelly McGillis,1957
Top Gun,Iceman,Val Kilmer,1959
Top Gun,Goose,Anthony Edwards,1962
Top Gun,Viper,Tom Skerritt,1933
When Harry Met Sally,Sally Albright,Meg Ryan,1961
Joe Versus the Volcano,DeDe;Angelica Graynamore;Patricia Graynamore,Meg Ryan,1961
Sleepless in Seattle,Annie Reed,Meg Ryan,1961
You've Got Mail,Kathleen Kelly,Meg Ryan,1961
Top Gun,Carole,Meg Ryan,1961
Jerry Maguire,Dorothy Boyd,Renee Zellweger,1969
Jerry Maguire,Avery Bishop,Kelly Preston,1962
Stand By Me,Vern Tessio,Jerry O'Connell,1974
Jerry Maguire,Frank Cushman,Jerry O'Connell,1974
Jerry Maguire,Bob Sugar,Jay Mohr,1970
The Green Mile,Jan Edgecomb,Bonnie Hunt,1961
Jerry Maguire,Laurel Boyd,Bonnie Hunt,1961
Jerry Maguire,Marcee Tidwell,Regina King,1971
Jerry Maguire,Ray Boyd,Jonathan Lipnicki,1990
Stand By Me,Chris Chambers,River Phoenix,1970
Stand By Me,Teddy Duchamp,Corey Feldman,1971
Stand By Me,Gordie Lachance,Wil Wheaton,1972
Stand By Me,Denny Lachance,John Cusack,1966
RescueDawn,Admiral,Marshall Bell,1942
Stand By Me,Mr. Lachance,Marshall Bell,1942
Cast Away,Kelly Frears,Helen Hunt,1963
Twister,Dr. Jo Harding,Helen Hunt,1963
As Good as It Gets,Carol Connelly,Helen Hunt,1963
You've Got Mail,Frank Navasky,Greg Kinnear,1963
As Good as It Gets,Simon Bishop,Greg Kinnear,1963
What Dreams May Come,Simon Bishop,Annabella Sciorra,1960
Snow Falling on Cedars,Nels Gudmundsson,Max von Sydow,1929
What Dreams May Come,The Tracker,Max von Sydow,1929
What Dreams May Come,The Face,Werner Herzog,1942
Bicentennial Man,Andrew Marin,Robin Williams,1951
The Birdcage,Armand Goldman,Robin Williams,1951
What Dreams May Come,Chris Nielsen,Robin Williams,1951
Snow Falling on Cedars,Ishmael Chambers,Ethan Hawke,1970
Ninja Assassin,Takeshi,Rick Yune,1971
Snow Falling on Cedars,Kazuo Miyamoto,Rick Yune,1971
The Green Mile,Warden Hal Moores,James Cromwell,1940
Snow Falling on Cedars,Judge Fielding,James Cromwell,1940
You've Got Mail,Patricia Eden,Parker Posey,1968
You've Got Mail,Kevin Jackson,Dave Chappelle,1973
RescueDawn,Duane,Steve Zahn,1967
You've Got Mail,George Pappas,Steve Zahn,1967
A League of Their Own,Jimmy Dugan,Tom Hanks,1956
The Polar Express,Hero Boy;Father;Conductor;Hobo;Scrooge;Santa Claus,Tom Hanks,1956
Charlie Wilson's War,Rep. Charlie Wilson,Tom Hanks,1956
Cast Away,Chuck Noland,Tom Hanks,1956
Apollo 13,Jim Lovell,Tom Hanks,1956
The Green Mile,Paul Edgecomb,Tom Hanks,1956
The Da Vinci Code,Dr. Robert Langdon,Tom Hanks,1956
Cloud Atlas,Zachry;Dr. Henry Goose;Isaac Sachs;Dermot Hoggins,Tom Hanks,1956
That Thing You Do,Mr. White,Tom Hanks,1956
Joe Versus the Volcano,Joe Banks,Tom Hanks,1956
Sleepless in Seattle,Sam Baldwin,Tom Hanks,1956
You've Got Mail,Joe Fox,Tom Hanks,1956
Sleepless in Seattle,Suzy,Rita Wilson,1956
Sleepless in Seattle,Walter,Bill Pullman,1953
Sleepless in Seattle,Greg,Victor Garber,1949

A League of Their Own,Doris Murphy,Rosie O'Donnell,1962
Sleepless in Seattle,Becky,Rosie O'Donnell,1962
The Birdcage,Albert Goldman,Nathan Lane,1956
Joe Versus the Volcano,Baw,Nathan Lane,1956
When Harry Met Sally,Harry Burns,Billy Crystal,1948
When Harry Met Sally,Marie,Carrie Fisher,1956
When Harry Met Sally,Jess,Bruno Kirby,1949
That Thing You Do,Faye Dolan,Liv Tyler,1977
The Replacements,Annabelle Farrell,Brooke Langton,1970
Unforgiven,Little Bill Daggett,Gene Hackman,1930
The Birdcage,Sen. Kevin Keeley,Gene Hackman,1930
The Replacements,Jimmy McGinty,Gene Hackman,1930
The Replacements,Clifford Franklin,Orlando Jones,1968
RescueDawn,Dieter Dengler,Christian Bale,1974
Twister,Eddie,Zach Grenier,1954
RescueDawn,Squad Leader,Zach Grenier,1954
Unforgiven,English Bob,Richard Harris,1930
Unforgiven,Bill Munny,Clint Eastwood,1930
Johnny Mnemonic,Takahashi,Takeshi Kitano,1947
Johnny Mnemonic,Jane,Dina Meyer,1968
Johnny Mnemonic,J-Bone,Ice-T,1958
Cloud Atlas,Luisa Rey;Jocasta Ayrs;Ovid;Meronym,Halle Berry,1966
Cloud Atlas,Vyvyan Ayrs;Captain Molyneux;Timothy Cavendish,Jim Broadbent,1949
The Da Vinci Code,Sir Leight Teabing,Ian McKellen,1939
The Da Vinci Code,Sophie Neveu,Audrey Tautou,1976
The Da Vinci Code,Silas,Paul Bettany,1971
V for Vendetta,Evey Hammond,Natalie Portman,1981
V for Vendetta,Eric Finch,Stephen Rea,1946
V for Vendetta,High Chancellor Adam Sutler,John Hurt,1940
Ninja Assassin,Ryan Maslow,Ben Miles,1967
Speed Racer,Cass Jones,Ben Miles,1967
V for Vendetta,Dascomb,Ben Miles,1967
Speed Racer,Speed Racer,Emile Hirsch,1985
Speed Racer,Pops,John Goodman,1960
Speed Racer,Mom,Susan Sarandon,1946
Speed Racer,Racer X,Matthew Fox,1966
Speed Racer,Trixie,Christina Ricci,1980
Ninja Assassin,Raizo,Rain,1982
Speed Racer,Taejo Togokahn,Rain,1982
Ninja Assassin,Mika Coretti,Naomie Harris,null
The Green Mile,John Coffey,Michael Clarke Duncan,1957
The Green Mile,Brutus 'Brutal' Howell,David Morse,1953
Frost/Nixon,"James Reston, Jr.",Sam Rockwell,1968
The Green Mile,'Wild Bill' Wharton,Sam Rockwell,1968
Apollo 13,Ken Mattingly,Gary Sinise,1955
The Green Mile,Burt Hammersmith,Gary Sinise,1955
The Green Mile,Melinda Moores,Patricia Clarkson,1959
Frost/Nixon,Richard Nixon,Frank Langella,1938
Frost/Nixon,David Frost,Michael Sheen,1969
Bicentennial Man,Rupert Burns,Oliver Platt,1960
Frost/Nixon,Bob Zelnick,Oliver Platt,1960
One Flew Over the Cuckoo's Nest,Martini,Danny DeVito,1944
Hoffa,Robert 'Bobby' Ciaro,Danny DeVito,1944
Hoffa,Peter 'Pete' Connelly,John C. Reilly,1965
Apollo 13,Gene Kranz,Ed Harris,1950
A League of Their Own,Bob Hinson,Bill Paxton,1955
Twister,Bill Harding,Bill Paxton,1955
Apollo 13,Fred Haise,Bill Paxton,1955
Charlie Wilson's War,Gust Avrakotos,Philip Seymour Hoffman,1967
Twister,Dustin 'Dusty' Davis,Philip Seymour Hoffman,1967
Something's Gotta Give,Erica Barry,Diane Keaton,1946
Charlie Wilson's War,Joanne Herring,Julia Roberts,1967
A League of Their Own,'All the Way' Mae Mordabito,Madonna,1954
A League of Their Own,Dottie Hinson,Geena Davis,1956
A League of Their Own,Kit Keller,Lori Petty,1963

Directors

The directors.csv file contains three columns: title, name, and born.

The content of the directors.csv file:

directors.csv

title,name,born
Speed Racer,Andy Wachowski,1967
Cloud Atlas,Andy Wachowski,1967
The Matrix Revolutions,Andy Wachowski,1967
The Matrix Reloaded,Andy Wachowski,1967
The Matrix,Andy Wachowski,1967
Speed Racer,Lana Wachowski,1965
Cloud Atlas,Lana Wachowski,1965
The Matrix Revolutions,Lana Wachowski,1965
The Matrix Reloaded,Lana Wachowski,1965
The Matrix,Lana Wachowski,1965
The Devil's Advocate,Taylor Hackford,1944
Ninja Assassin,James Marshall,1967
V for Vendetta,James Marshall,1967
When Harry Met Sally,Rob Reiner,1947
Stand By Me,Rob Reiner,1947
A Few Good Men,Rob Reiner,1947
Top Gun,Tony Scott,1944
Jerry Maguire,Cameron Crowe,1957
As Good as It Gets,James L. Brooks,1940
RescueDawn,Werner Herzog,1942
What Dreams May Come,Vincent Ward,1956
Snow Falling on Cedars,Scott Hicks,1953
That Thing You Do,Tom Hanks,1956
Sleepless in Seattle,Nora Ephron,1941
You've Got Mail,Nora Ephron,1941
Joe Versus the Volcano,John Patrick Stanley,1950
The Replacements,Howard Deutch,1950
Charlie Wilson's War,Mike Nichols,1931
The Birdcage,Mike Nichols,1931
Unforgiven,Clint Eastwood,1930
Johnny Mnemonic,Robert Longo,1953
Cloud Atlas,Tom Tykwer,1965
Apollo 13,Ron Howard,1954
Frost/Nixon,Ron Howard,1954
The Da Vinci Code,Ron Howard,1954
The Green Mile,Frank Darabont,1959
Hoffa,Danny DeVito,1944
Twister,Jan de Bont,1943
The Polar Express,Robert Zemeckis,1951
Cast Away,Robert Zemeckis,1951
One Flew Over the Cuckoo's Nest,Milos Forman,1932
Something's Gotta Give,Nancy Meyers,1949
Bicentennial Man,Chris Columbus,1958
A League of Their Own,Penny Marshall,1943

Prerequisites
The example uses the Linux or macOS tarball installation. It assumes that your current working directory is
the <neo4j-home> directory of the tarball installation and that the CSV files are placed in the default import
directory.

• For the default directory of other installations, see Operations Manual → File locations.

• The import location can be configured with Operations Manual → server.directories.import, as sketched below.
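
For example, a neo4j.conf entry pointing the import directory somewhere else might look like the following. This is an illustrative sketch only; the path is hypothetical and the exact file location depends on your installation.

# neo4j.conf: use a custom import directory for LOAD CSV
server.directories.import=/data/csv-import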

Importing the data


Import the movies.csv file

LOAD CSV WITH HEADERS FROM 'file:///movies.csv' AS line
MERGE (m:Movie {title: line.title})
ON CREATE SET
m.released = toInteger(line.released),
m.tagline = line.tagline

Result

Added 38 nodes, Set 114 properties, Added 38 labels

Import the actors.csv file

LOAD CSV WITH HEADERS FROM 'file:///actors.csv' AS line
MATCH (m:Movie {title: line.title})
MERGE (p:Person {name: line.name})
ON CREATE SET p.born = toInteger(line.born)
MERGE (p)-[:ACTED_IN {roles:split(line.roles, ';')}]->(m)

Result

Added 102 nodes, Created 172 relationships, Set 375 properties, Added 102 labels
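
The roles column in actors.csv can contain several roles separated by semicolons; the split(line.roles, ';') call in the query above turns that string into a list. A standalone illustration (not part of the original example):

RETURN split('Hero Boy;Father;Conductor', ';') AS roles
// Returns ["Hero Boy", "Father", "Conductor"]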

Import the directors.csv file

LOAD CSV WITH HEADERS FROM 'file:///directors.csv' AS line
MATCH (m:Movie {title: line.title})
MERGE (p:Person {name: line.name})
ON CREATE SET p.born = toInteger(line.born)
MERGE (p)-[:DIRECTED]->(m)

Result

Added 23 nodes, Created 44 relationships, Set 46 properties, Added 23 labels
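
At this point all three files have been imported. As an optional sanity check (not part of the original example), you can count what was created:

MATCH (p:Person)-[r]->(m:Movie)
RETURN count(DISTINCT p) AS people, count(DISTINCT m) AS movies, count(r) AS relationships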

Create an index for nodes with the Person label

CREATE INDEX FOR (p:Person) ON (p.name)

Result

Added 1 indexes

To make sure the index is online before running the queries below, call:

CALL db.awaitIndexes

Index-backed property-lookup
In this example, you want to write a query that finds persons whose name starts with 'Tom' and who acted in a movie.

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name STARTS WITH 'Tom'
RETURN
p.name AS name,
count(m) AS count

Result

name            count
"Tom Cruise"    3
"Tom Hanks"     12
"Tom Skerritt"  1

Rows: 3

The query asks the database to return all the actors whose first name is 'Tom'. There are three of them:
'Tom Cruise', 'Tom Skerritt', and 'Tom Hanks'. With native indexes, however, you can leverage the fact that
indexes store the property values. In this case, it means that the names can be looked up directly from the
index. This allows Cypher to avoid a second call to the database to fetch the property, which can save
time on very large queries.

If we profile the above query, we see that the Details column of the NodeIndexSeekByRange operator contains
cache[p.name], which means that p.name is retrieved from the index. We can also see that the
OrderedAggregation has no DB Hits, which means it does not have to access the database again.

PROFILE
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name STARTS WITH 'Tom'
RETURN
p.name AS name,
count(m) AS count

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline
+ProduceResults | 0 | name, count | 1 | 3 | 0 | | 0/0 | 0.060 | |
+OrderedAggregation | 1 | cache[p.name] AS name, count(m) AS count | 1 | 3 | 0 | 1520 | 0/0 | 3.119 | name ASC | In Pipeline 1
+Filter | 2 | m:Movie | 1 | 16 | 32 | | | | |
+Expand(All) | 3 | (p)-[anon_0:ACTED_IN]->(m) | 1 | 16 | 22 | | | | |
+NodeIndexSeekByRange | 4 | RANGE INDEX p:Person(name) WHERE name STARTS WITH $autostring_0, cache[p.name] | 0 | 4 | 5 | 120 | 7/0 | 0.611 | p.name ASC | Fused in Pipeline 0

Total database accesses: 59, total allocated memory: 1600

3 rows

If we change the query so that it can no longer use an index, there will be no cache[p.name] in the Details
column, and the EagerAggregation will now have DB Hits, since it accesses the database again to retrieve the
name.

PROFILE
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN
p.name AS name,
count(m) AS count

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
+ProduceResults | 0 | name, count | 13 | 102 | 0 | | 0/0 | 0.155 | In Pipeline 1
+EagerAggregation | 1 | p.name AS name, count(m) AS count | 13 | 102 | 344 | 17296 | | |
+Filter | 2 | p:Person | 172 | 172 | 344 | | | |
+Expand(All) | 3 | (m)<-[anon_0:ACTED_IN]-(p) | 172 | 172 | 254 | | | |
+NodeByLabelScan | 4 | m:Movie | 38 | 38 | 39 | 120 | 29/0 | 1.444 | Fused in Pipeline 0

Total database accesses: 981, total allocated memory: 17376

102 rows

For non-native indexes there will still be a second database access to retrieve those values.

Predicates that can be used to enable this optimization are:

• Existence (e.g. WHERE n.name IS NOT NULL)

• Equality (e.g. WHERE n.name = 'Tom Hanks')

• Range (e.g. WHERE n.uid > 1000 AND n.uid < 2000)

• Prefix (e.g. WHERE n.name STARTS WITH 'Tom')

• Suffix (e.g. WHERE n.name ENDS WITH 'Hanks')

• Substring (e.g. WHERE n.name CONTAINS 'a')

• Several predicates of the above types combined using OR, given that all of them are on the same
property (e.g. WHERE n.prop < 10 OR n.prop = 'infinity')

If there is an existence constraint on the property, no predicate is required to trigger the optimization.
For example: CREATE CONSTRAINT constraint_name FOR (p:Person) REQUIRE p.name IS NOT NULL.
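
As an illustrative sketch, the following query combines two of the predicate types listed above on the same property, so the optimization can still be used:

MATCH (n:Person)
WHERE n.name STARTS WITH 'Tom' OR n.name = 'Al Pacino'
RETURN n.name AS name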

Aggregating functions

For all built-in aggregating functions in Cypher, the index-backed property-lookup optimization can be
used even without a predicate.

Consider this query which returns the number of distinct names of people in the movies dataset:

PROFILE
MATCH (p:Person)
RETURN count(DISTINCT p.name) AS numberOfNames

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
+ProduceResults | 0 | numberOfNames | 1 | 1 | 0 | | 0/0 | 0.026 | In Pipeline 1
+EagerAggregation | 1 | count(DISTINCT cache[p.name]) AS numberOfNames | 1 | 1 | 0 | 9888 | | |
+NodeIndexScan | 2 | RANGE INDEX p:Person(name) WHERE name IS NOT NULL, cache[p.name] | 125 | 125 | 126 | 120 | 1/0 | 1.400 | Fused in Pipeline 0

Total database accesses: 126, total allocated memory: 9952

1 row

Note that the Details column of the NodeIndexScan contains cache[p.name] and that the EagerAggregation
has no DB Hits. In this case, the semantics of aggregating functions work like an implicit existence
predicate, because Person nodes without the name property will not affect the result of an aggregation.
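
As a small illustration of that implicit existence behaviour (a sketch, not part of the original example), a Person node without a name property contributes to count(*) but not to count(p.name):

MATCH (p:Person)
RETURN count(*) AS allPeople, count(p.name) AS peopleWithName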

Index-backed ORDER BY
Now consider the following refinement to the query:

PROFILE
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name STARTS WITH 'Tom'
RETURN
p.name AS name,
count(m) AS count
ORDER BY name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline
+ProduceResults | 0 | name, count | 1 | 3 | 0 | | 0/0 | 0.025 | |
+OrderedAggregation | 1 | cache[p.name] AS name, count(m) AS count | 1 | 3 | 0 | 1520 | 0/0 | 0.097 | name ASC | In Pipeline 1
+Filter | 2 | m:Movie | 1 | 16 | 32 | | | | |
+Expand(All) | 3 | (p)-[anon_0:ACTED_IN]->(m) | 1 | 16 | 22 | | | | |
+NodeIndexSeekByRange | 4 | RANGE INDEX p:Person(name) WHERE name STARTS WITH $autostring_0, cache[p.name] | 0 | 4 | 5 | 120 | 7/0 | 0.406 | p.name ASC | Fused in Pipeline 0

We are asking for the results in ascending alphabetical order. The native index happens to store String
properties in ascending alphabetical order, and Cypher knows this. In Neo4j 3.5 and later, the Cypher
planner will recognize that the index already returns data in the correct order, and skip the Sort operation.

The Order by column describes the order of rows after each operator. We see that the Order by column
contains p.name ASC from the index seek operation, meaning that the rows are ordered by p.name in
ascending order.

Index-backed ORDER BY can also be used for queries that expect their results in descending order, but
with slightly lower performance.
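
For example, the query above can request descending order and still benefit from the index (a sketch based on the same dataset):

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name STARTS WITH 'Tom'
RETURN
p.name AS name,
count(m) AS count
ORDER BY name DESC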

In cases where the Cypher planner is unable to remove the Sort operator, the planner can utilize knowledge
of the ORDER BY clause to plan the Sort operator at a point in the plan with optimal cardinality.
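
One case where the Sort operator cannot be removed is ordering on an aggregated value instead of on the indexed property, as in the following sketch:

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name STARTS WITH 'Tom'
RETURN
p.name AS name,
count(m) AS count
ORDER BY count DESC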

min() and max()

For the min and max functions, the index-backed ORDER BY optimization can be used to avoid aggregation
and instead utilize the fact that the minimum/maximum value is the first/last one in a sorted index.
Consider the following query, which returns the first actor in alphabetical order:

PROFILE
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN min(p.name) AS name

Query Plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
+ProduceResults | 0 | name | 1 | 1 | 0 | | 0/0 | 0.027 | In Pipeline 1
+EagerAggregation | 1 | min(p.name) AS name | 1 | 1 | 344 | 32 | | |
+Filter | 2 | p:Person | 172 | 172 | 344 | | | |
+Expand(All) | 3 | (m)<-[anon_0:ACTED_IN]-(p) | 172 | 172 | 254 | | | |
+NodeByLabelScan | 4 | m:Movie | 38 | 38 | 39 | 120 | 29/0 | 0.990 | Fused in Pipeline 0

Total database accesses: 981, total allocated memory: 232

1 row

Aggregations usually use the EagerAggregation operator, which would mean scanning all nodes in
the index to find the name that comes first in alphabetical order. Instead, the query can be planned with a
Projection, followed by a Limit, followed by an Optional, which simply picks the first value from the index.

For large datasets, this can improve performance dramatically.

Index-backed ORDER BY can also be used for corresponding queries with the max function, but with
slightly lower performance.
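
The corresponding query with the max function looks like this (a sketch):

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN max(p.name) AS name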

Restrictions

The optimization can only work on native indexes. It does not work for predicates only querying for the
spatial type POINT.

Predicates that can be used to enable this optimization are:

• Existence (e.g. WHERE n.name IS NOT NULL)

• Equality (e.g. WHERE n.name = 'Tom Hanks')

• Range (e.g. WHERE n.uid > 1000 AND n.uid < 2000)

• Prefix (e.g. WHERE n.name STARTS WITH 'Tom')

• Suffix (e.g. WHERE n.name ENDS WITH 'Hanks')

• Substring (e.g. WHERE n.name CONTAINS 'a')

Predicates that will not work:

• Several predicates combined using OR

• Equality or range predicates querying for points (e.g. WHERE n.place > point({ x: 1, y: 2 }))

• Spatial distance predicates (e.g. WHERE point.distance(n.place, point({ x: 1, y: 2 })) < 2)

If there is a property existence constraint on the property, no predicate is required to trigger the
optimization. For example: CREATE CONSTRAINT constraint_name FOR (p:Person) REQUIRE p.name IS NOT NULL.

Predicates with parameters, such as WHERE n.prop > $param, can trigger index-backed
ORDER BY. The only exception is queries with parameters of type POINT.
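
For example, a parameterized range predicate might be written as follows; $name is a hypothetical parameter supplied by the caller:

MATCH (p:Person)
WHERE p.name > $name
RETURN p.name AS name
ORDER BY name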

Shortest path planning


This page contains an example of how to plan queries using the shortestPath() function.

Planning shortest paths in Cypher can lead to different query plans depending on the predicates that need
to be evaluated. Internally, Neo4j will use a fast bidirectional breadth-first search algorithm if the
predicates can be evaluated whilst searching for the path. Therefore, this fast algorithm will always be
certain to return the right answer when there are universal predicates on the path; for example, when
searching for the shortest path where all nodes have the Person label, or where there are no nodes with a
name property.
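
For instance, a universal predicate of that kind could be expressed as follows (a sketch reusing the dataset above; every node on the path must have the Person label). On this particular dataset the query may of course return no rows; the point is only the shape of the predicate:

MATCH
  (KevinB:Person {name: 'Kevin Bacon'}),
  (Al:Person {name: 'Al Pacino'}),
  p = shortestPath((KevinB)-[*]-(Al))
WHERE all(n IN nodes(p) WHERE n:Person)
RETURN p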

If the predicates need to inspect the whole path before deciding on whether it is valid or not, this fast
algorithm cannot be relied on to find the shortest path, and Neo4j may have to resort to using a slower
exhaustive depth-first search algorithm to find the path. This means that query plans for shortest path
queries with non-universal predicates will include a fallback to running the exhaustive search to find the
path should the fast algorithm not succeed. For example, depending on the data, an answer to a shortest
path query with existential predicates — such as the requirement that at least one node contains the
property name='Kevin Bacon' — may not be able to be found by the fast algorithm. In this case, Neo4j will
fall back to using the exhaustive search to enumerate all paths and potentially return an answer.

The running times of these two algorithms may differ by orders of magnitude, so it is important to ensure
that the fast approach is used for time-critical queries.

When the exhaustive search is planned, it is still only executed when the fast algorithm fails to find any
matching paths. The fast algorithm is always executed first, since it is possible that it can find a valid path
even though that could not be guaranteed at planning time.

Please note that falling back to the exhaustive search may prove to be a very time-consuming strategy in
some cases, such as when there is no shortest path between two nodes. Therefore, in these cases, it is
recommended to set cypher.forbid_exhaustive_shortestpath to true, as explained in Operations Manual
→ Configuration settings.
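
As a sketch, the corresponding neo4j.conf entry could look like the one below. Note that this section refers to the setting both with and without the dbms. prefix, so verify the exact key for your version in the Operations Manual:

# neo4j.conf: raise an error instead of falling back to the exhaustive shortest-path search
dbms.cypher.forbid_exhaustive_shortestpath=true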

Shortest path — fast algorithm

Example 502. Query evaluated with the fast algorithm

This query can be evaluated with the fast algorithm — there are no predicates that need to see the
whole path before being evaluated.

Query

PROFILE
MATCH
(KevinB:Person {name: 'Kevin Bacon'}),
(Al:Person {name: 'Al Pacino'}),
p = shortestPath((KevinB)-[:ACTED_IN*]-(Al))
WHERE all(r IN relationships(p) WHERE r.role IS NOT NULL)
RETURN p

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
+ProduceResults | p | 2 | 1 | 0 | | 1/0 | 0.252 |
+ShortestPath | p = (KevinB)-[anon_0:ACTED_IN*]-(Al) WHERE all(r IN relationships(p) WHERE r.role IS NOT NULL) | 2 | 1 | 23 | 1688 | | | In Pipeline 1
+MultiNodeIndexSeek | RANGE INDEX KevinB:Person(name) WHERE name = $autostring_0, RANGE INDEX Al:Person(name) WHERE name = $autostring_1 | 2 | 1 | 4 | 120 | 1/1 | 0.916 | In Pipeline 0

Total database accesses: 27, total allocated memory: 1752

Shortest path — additional predicate checks on the paths


Predicates used in the WHERE clause that apply to the shortest path pattern are evaluated before deciding
what the shortest matching path is.

Example 503. Consider using the exhaustive search as a fallback

Query

MATCH
(KevinB:Person {name: 'Kevin Bacon'}),
(Al:Person {name: 'Al Pacino'}),
p = shortestPath((KevinB)-[*]-(Al))
WHERE length(p) > 1
RETURN p

This query, in contrast with the one above, needs to check that the whole path follows the predicate
before we know if it is valid or not, and so the query plan will also include the fallback to the slower
exhaustive search algorithm.

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 1024

Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
+ProduceResults | p | 1 | 1 | 0 | | | |
+AntiConditionalApply | | 1 | 1 | 0 | 41464 | 0/0 | 0.332 | Fused in Pipeline 6
| +Top | anon_1 ASC LIMIT 1 | 2 | 0 | 0 | 4280 | 0/0 | 0.000 | In Pipeline 5
| +Projection | length(p) AS anon_1 | 7966 | 0 | 0 | | | |
| +Filter | length(p) > $autoint_2 | 7966 | 0 | 0 | | | |
| +Projection | (KevinB)-[anon_0*]-(Al) AS p | 26554 | 0 | 0 | | | |
| +VarLengthExpand(Into) | (KevinB)-[anon_0*]-(Al) | 26554 | 0 | 0 | | | |
| +Argument | KevinB, Al | 2 | 0 | 0 | 0 | 0/0 | 0.000 | Fused in Pipeline 4
+Apply | | 2 | 1 | 0 | | 0/0 | 0.026 |
| +Optional | KevinB, Al | 2 | 1 | 0 | 4840 | 0/0 | 0.134 | In Pipeline 3
| +ShortestPath | p = (KevinB)-[anon_0*]-(Al) WHERE length(p) > $autoint_2 | 1 | 1 | 1 | 1760 | | | In Pipeline 2
| +Argument | KevinB, Al | 2 | 1 | 0 | 24680 | 0/0 | 0.056 | In Pipeline 1
+MultiNodeIndexSeek | RANGE INDEX KevinB:Person(name) WHERE name = $autostring_0, RANGE INDEX Al:Person(name) WHERE name = $autostring_1 | 2 | 1 | 4 | 120 | 2/0 | 0.644 | In Pipeline 0

Total database accesses: 5, total allocated memory: 50152

The way the bigger exhaustive query plan works is by using Apply/Optional to ensure that when the fast
algorithm does not find any results, a null result is generated instead of simply stopping the result stream.
On top of this, the planner will issue an AntiConditionalApply, which will run the exhaustive search if the
path variable is pointing to null instead of a path.

An ErrorPlan operator will appear in the execution plan in cases where:

• dbms.cypher.forbid_exhaustive_shortestpath is set to true.

• The fast algorithm is not able to find the shortest path.

Example 504. Prevent the exhaustive search from being used as a fallback

Query

MATCH
(KevinB:Person {name: 'Kevin Bacon'}),
(Al:Person {name: 'Al Pacino'}),
p = shortestPath((KevinB)-[*]-(Al))
WITH p
WHERE length(p) > 1
RETURN p

This query, just like the one above, needs to check that the whole path follows the predicate before
we know if it is valid or not. However, the inclusion of the WITH clause means that the query plan will
not include the fallback to the slower exhaustive search algorithm. Instead, any paths found by the
fast algorithm will subsequently be filtered, which may result in no answers being returned.

Query plan

Planner COST

Runtime PIPELINED

Runtime version 5.25

Batch size 128

Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline
+ProduceResults | p | 1 | 1 | 0 | | 1/0 | 0.353 |
+Filter | length(p) > $autoint_2 | 1 | 1 | 0 | | 0/0 | 0.255 |
+ShortestPath | p = (KevinB)-[anon_0*]-(Al) | 2 | 1 | 1 | 1760 | | | In Pipeline 1
+MultiNodeIndexSeek | RANGE INDEX KevinB:Person(name) WHERE name = $autostring_0, RANGE INDEX Al:Person(name) WHERE name = $autostring_1 | 2 | 1 | 4 | 120 | 2/0 | 0.371 | In Pipeline 0

Total database accesses: 5, total allocated memory: 1824

License
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

You are free to


Share
copy and redistribute the material in any medium or format

Adapt
remix, transform, and build upon the material

The licensor cannot revoke these freedoms as long as you follow the license terms.

Under the following terms


Attribution
You must give appropriate credit, provide a link to the license, and indicate if changes were made. You
may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or
your use.

NonCommercial
You may not use the material for commercial purposes.

ShareAlike
If you remix, transform, or build upon the material, you must distribute your contributions under the
same license as the original.

No additional restrictions
You may not apply legal terms or technological measures that legally restrict others from doing
anything the license permits.

Notices
You do not have to comply with the license for elements of the material in the public domain or where your
use is permitted by an applicable exception or limitation.

No warranties are given. The license may not give you all of the permissions necessary for your intended
use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the
material.

See https://creativecommons.org/licenses/by-nc-sa/4.0/ for further details. The full license text is available
at https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.

[18] The FLOAT type in Cypher always represents a 64-bit double-precision floating point number.

[19] The INTEGER type in Cypher always represents a 64-bit INTEGER.

[20] The INNER_TYPE cannot be a LIST type.
