sql - INSERT or SELECT strategy to always return a row?

posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

Using Postgres 9.6, I have followed the strategy recommended in https://stackoverflow.com/a/40325406/435563 to do an INSERT or SELECT and return the resulting id:

with ins as (
  insert into prop (prop_type, norm, hash, symbols)
  values (
    $1, $2, $3, $4
  ) on conflict (hash) do
    update set prop_type = 'jargon' where false
  returning id)
select id from ins
union all
select id from prop where hash = $3

However, sometimes this returns nothing. I would have expected it to return a row no matter what. How can I fix it to ensure it always returns an id?

NB, despite not returning a row, the row does seem to exist on inspection. I believe the problem may be related to trying to add the same record via two sessions simultaneously.

The table in question is defined as:

create table prop (
  id serial primary key,
  prop_type text not null references prop_type(name),
  norm text not null,
  hash text not null unique,
  symbols jsonb
);

Data:

EDT DETAIL:  parameters: $1 = 'jargon', $2 = 'j2', $3 = 'lXWkZSmoSE0mZ+n4xpWB', $4 = '[]'

If I change prop_type = 'jargon' to prop_type = 'foo' it works! It would seem the lock isn't taken if the expression wouldn't change anything even given the where false clause. Does this really need to depend on my guessing a value that wouldn't be in the row, though? Or is there a better way to ensure you get the lock?

--- UPDATE ---

The overall situation is that the application tried to save a directed acyclic graph using a connection pool (with autocommit) and used this query to get the id while winnowing out duplicates. [It turns out that a much smarter approach is to use a transaction and serialize everything over one connection. But the behavior under contention here is odd.]

The foreign key constraint doesn't seem to affect the insert -- e.g.:

create table foo(i int unique, prop_id int references prop(id));
insert into foo values (1, 208);
insert into foo values (1, 208) 
on conflict (i) do update set prop_id = 208 where false;
--> INSERT 0 0
insert into foo values (1, 208) 
on conflict (i) do update set prop_id = -208 where false;
--> INSERT 0 0

Note that one uses a valid FK value (208) and the other an invalid one (-208). If I attach a select to either of these using the full pattern, then in situations without contention both return i = 1 as expected.

1 Reply

深蓝 · Answer 1 · 2021-10-23T21:39:35+0000

Your observation seems impossible. The above command should always return an id, either for the newly inserted row or for the pre-existing row. Concurrent writes cannot mess with this since existing conflicting rows are locked. Explanation in this related answer:

How to use RETURNING with ON CONFLICT in PostgreSQL?

Unless an exception is raised, of course. You get an error message instead of a result in that case. Did you check that? Do you have error-handling in place? (In case your app somehow discards error messages: 1) Fix that. 2) There is an additional entry in the DB log with default logging settings.)
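
To check the logging side, here is a minimal sketch. These statements only read server settings; which values your server reports is an assumption on my part, but with stock settings a failing statement should end up in the server log.

```sql
-- Statements that raise at least this severity are written to the
-- server log together with their SQL text (stock default: error).
show log_min_error_statement;

-- Where log output goes, and whether the collector writes it to files.
show log_destination;
show logging_collector;
show log_directory;
```

If log_min_error_statement has been raised above error, the failing INSERT would not be logged with its text, which could explain a "silent" failure.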

I do see a FK constraint in your table definition:

prop_type text not null references prop_type(name),

If you try to insert a row that violates the constraint, that is exactly what happens. If there is no row with name = 'jargon' in table prop_type, you get:

ERROR:  insert or update on table "prop" violates foreign key constraint "prop_prop_type_fkey"
DETAIL:  Key (prop_type)=(jargon) is not present in table "prop_type".
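
A quick way to rule this out, sketched with the table and column names from the question (the literal 'jargon' is taken from the logged parameters):

```sql
-- Returns true if the referenced type exists, false otherwise.
select exists (
  select 1
  from prop_type
  where name = 'jargon'
) as type_exists;
```

If this returns false, a fresh insert with that prop_type raises the foreign key error shown above instead of returning an id.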

Demo:

dbfiddle here

Your observation would fit the crime:

"If I change prop_type = 'jargon' to prop_type = 'foo' it works!"

But your explanation is based on misconceptions:

"It would seem the lock isn't taken if the expression wouldn't change anything even given the where false clause."

That's not how Postgres works. The lock is taken either way (explanation in above linked answer), and the Postgres locking mechanism never even considers how the new row compares to the old.
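
You can probe this yourself with two connections. The sketch below assumes the row with that hash already exists and that 'jargon' is present in prop_type; the values are copied from the question. If the row lock is held as described, session B should fail immediately with a lock-not-available error rather than return the id.

```sql
-- Session A: run the no-op upsert and keep the transaction open.
begin;
insert into prop (prop_type, norm, hash, symbols)
values ('jargon', 'j2', 'lXWkZSmoSE0mZ+n4xpWB', '[]')
on conflict (hash) do update
  set prop_type = 'jargon' where false;

-- Session B (separate connection), while A is still open:
-- NOWAIT raises an error right away instead of waiting for a locked row.
select id
from prop
where hash = 'lXWkZSmoSE0mZ+n4xpWB'
for update nowait;

-- Session A: end the transaction, which releases the lock.
rollback;
```

Swapping 'jargon' for 'foo' in the SET clause should make no difference to the outcome in session B, since the WHERE condition is only evaluated after the conflicting row has been locked.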

"Does this really need to depend on my guessing a value that wouldn't be in the row, though? Or is there a better way to ensure you get the lock?"

No. And no.

If missing FK values are indeed the problem, you might add the missing (distinct) values in the same statement with data-modifying CTEs. This is simple for single-row inserts like the one you demonstrate, and it also works for inserting many rows at once.
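
A sketch of that single-statement approach for the single-row case. It assumes prop_type needs nothing but name to create a row (the question does not show that table), and it uses literal values from the logged parameters in place of $1 .. $4:

```sql
with typ as (
  -- Add the referenced type if it is missing; do nothing if it exists.
  -- The unique constraint on name is implied by the FK that references it.
  insert into prop_type (name)
  values ('jargon')
  on conflict (name) do nothing
), ins as (
  insert into prop (prop_type, norm, hash, symbols)
  values ('jargon', 'j2', 'lXWkZSmoSE0mZ+n4xpWB', '[]')
  on conflict (hash) do update
    set prop_type = excluded.prop_type where false
  returning id
)
select id from ins
union all
select id from prop where hash = 'lXWkZSmoSE0mZ+n4xpWB';
```

The FK check runs at the end of the statement, so the row added in typ should satisfy it even though typ is never referenced in the outer query. Data-modifying CTEs are executed whether or not anything selects from them.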
