Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
330 views
in Technique[技术] by (71.8m points)

Neo4j: MERGE creates duplicate nodes

My database model has users and MAC addresses. A user can have multiple MAC addresses, but a MAC can only belong to one user. If some user sets his MAC and that MAC is already linked to another user, the existing relationship is removed and a new relationship is created between the new owner and that MAC. In other words, a MAC moves between users.

This is a particular instance of the Cypher query I'm using to assign MAC addresses:

MATCH (new:User { Id: 2 })
MERGE (mac:MacAddress { Value: "D857EFEF1CF6" })
WITH new, mac
OPTIONAL MATCH ()-[oldr:MAC_ADDRESS]->(mac)
DELETE oldr
MERGE (new)-[:MAC_ADDRESS]->(mac)

The query runs fine in my tests, but in production, for some strange reason it sometimes creates duplicate MacAddress nodes (and a new relationship between the user and each of those nodes). That is, a particular user can have multiple MacAddress nodes with the same Value.

I can tell they are different nodes because they have different node ID's. I'm also sure the Values are exactly the same because I can do a collect(distinct mac.Value) on them and the result is a collection with one element. The query above is the only one in the code that creates MacAddress nodes.

I'm using Neo4j 2.1.2. What's going on here?

Thanks, Jan

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Are you sure this is the entirety of the queries you're running? MERGE has this really common pitfall where it merges everything that you give it. So here's what people expect:

neo4j-sh (?)$ MERGE (mac:MacAddress { Value: "D857EFEF1CF6" });
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 1
Properties set: 1
Labels added: 1
1650 ms
neo4j-sh (?)$ MERGE (mac:MacAddress { Value: "D857EFEF1CF6" });
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
17 ms
neo4j-sh (?)$ match (mac:MacAddress { Value: "D857EFEF1CF6" }) return count(mac);
+------------+
| count(mac) |
+------------+
| 1          |
+------------+
1 row
200 ms

So far, so good. That's what we expect. Now watch this:

neo4j-sh (?)$ MERGE (mac:MacAddress { Value: "D857EFEF1CF6" })-[r:foo]->(b:SomeNode {label: "Foo!"});
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 2
Relationships created: 1
Properties set: 2
Labels added: 2
178 ms
neo4j-sh (?)$ match (mac:MacAddress { Value: "D857EFEF1CF6" }) return count(mac);                    
+------------+
| count(mac) |
+------------+
| 2          |
+------------+
1 row
2 ms

Wait, WTF happened here? We specified only the same MAC address again, why is a duplicate created?

The documentation on MERGE specifies that "MERGE will not partially use existing patterns?—?it’s all or nothing. If partial matches are needed, this can be accomplished by splitting a pattern up into multiple MERGE clauses". So because when we run this path MERGE the whole path doesn't already exist, it creates everything in it, including a duplicate mac address node.

There are frequently questions about duplicated nodes created by MERGE, and 99 times out of 100, this is what's going on.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...