I have a Ruby on Rails 4 app that uses a PostgreSQL query which looks into an Opportunity table, randomly selects rows where available = true, and updates the selected rows to available = false.
Each time a user clicks on 'try an opportunity', the app runs the query below:
UPDATE opportunities s
SET    opportunity_available = false
FROM  (
   SELECT id
   FROM   opportunities
   WHERE  deal_id = #{@deal.id}
   AND    opportunity_available
   AND    pg_try_advisory_xact_lock(id)
   LIMIT  1
   FOR    UPDATE
   ) sub
WHERE  s.id = sub.id
RETURNING s.prize_id, s.id;
-- inspired by https://stackoverflow.com/questions/33128531/put-pg-try-advisory-xact-lock-in-a-nested-subquery
I've been struggling for 5 days, but I have now managed to roughly understand how it's behaving (badly); now I need to know how to change it.
Indeed, the app "goes a little crazy" and selects (and updates) exactly the rows that I updated most recently.
Let me be clear about how I create these opportunities:
As an admin, I create 20 opportunities (rows) via my app.
Then, in my admin panel, I create a prize with a quantity, let's say prize_id = 45 with quantity = 4. The app will go into the Opportunity table and randomly fill (this part works perfectly) 4 rows with prize_id = 45.
Now a user runs the app and, as explained above, the app does not choose a row randomly but always begins with the most recently updated ones: it takes one of the rows with a prize, then another with a prize, then another, and never the ones without prizes (with prize_id empty)...
I also ran some manual experiments: I manually changed values in my table (via pgAdmin), and the weirdest thing happened. If I modify row 32, then row 45, then row 67, when the user plays again, guess what: the rows that get picked are exactly the ones I updated, in reverse order. It picks row 67, then row 45, then row 32...
It won't even consider the other rows that could be chosen (all the other opportunities where available = true).
I also tried removing FOR UPDATE and the pg_try_advisory_xact_lock(id) line, and it still showed the same issue.
Since, as an admin, I first create all 20 rows and THEN the 4 winning rows, the winning rows are the last to be updated (even though on my pgAdmin screen they stay in the same positions... maybe, in the background, PostgreSQL orders them as the last updated and selects them consecutively?), and that's roughly why, as soon as one of these 4 winning rows is selected, all the others follow.
To be clear, I would be OK with the app going row by row through the available opportunities (e.g. row 3, then row 4, then row 5, since prizes are already assigned to rows totally at random, as far as I can see in pgAdmin). The problem is that it's not doing this; it often takes all the winning rows consecutively...
I'm speechless and have no clue how to break this pattern.
Note: this pattern does not happen 100% of the time, but very often. For example, when I have more than 4 winning rows, it's as if for 2 minutes, if I keep clicking as a user, it behaves as described here; then it stops and seems (or I might be wrong) to behave normally; then after a minute it again selects only winning rows...
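To convince myself this is about pick order rather than locking, I wrote a toy Ruby sketch (plain Ruby, no database; the Opportunity struct, the ids, and prize_id = 45 are made up to mirror my setup). It contrasts an "arbitrary first match" pick, where the storage layer happens to hand back the recently updated rows first, with a truly random pick:

```ruby
# Toy illustration: an "arbitrary" pick that always takes the first matching
# element behaves very differently from a truly random pick, even though
# both technically satisfy "LIMIT 1".
Opportunity = Struct.new(:id, :prize_id, :available)

# 20 rows; rows 5, 9, 13, 17 were "updated last" when prizes were injected.
rows = (1..20).map { |i| Opportunity.new(i, nil, true) }
winning_ids = [5, 9, 13, 17]
winning_ids.each { |i| rows[i - 1].prize_id = 45 }

# Simulate "recently updated rows come back first" -- an arbitrary order
# the database is free to choose when there is no ORDER BY:
scan_order = rows.sort_by { |r| winning_ids.include?(r.id) ? 0 : 1 }

arbitrary_pick = scan_order.find(&:available)    # first match: a winning row
random_pick    = rows.select(&:available).sample # uniform over all 20 rows

puts arbitrary_pick.prize_id # 45, every single time
puts random_pick.prize_id    # usually nil (16 of 20 rows have no prize)
```

The arbitrary pick lands on a winning row every time, while the random pick is uniform over all available rows, which matches what I'm seeing against PostgreSQL.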
EDIT 1
Here is how prizes are injected into the Opportunity table (for example, when I create a prize whose id is 21 and whose quantity is 3): it distributes them randomly (as far as I understand), but only where there is not already a prize_id (i.e. if an opportunity has prize_id empty, it can put it there):
SQL (2.4ms)
UPDATE "opportunities"
SET    "prize_id" = 21
WHERE  "opportunities"."id" IN (
   SELECT "opportunities"."id"
   FROM   "opportunities"
   WHERE  (deal_id = 341 AND prize_id IS NULL)
   ORDER  BY RANDOM()
   LIMIT  3)
(0.9ms) COMMIT
This SQL query is generated by a Rails gem called Randumb (github.com/spilliton/randumb).
EDIT 2
I have moved on to another, more precise question: Postgresql 9.4 - FASTEST query to select and update on large dataset (>30M rows) with heavy writes/reads and locks
Indeed, I think the problem is really that I need a truly RANDOM pick and must move away from an ARBITRARY pick.
Erwin already said it here: Advisory locks or NOWAIT to avoid waiting for locked rows? Only now do I understand what he meant: '"Arbitrary" also matters because it implies that Postgresql will typically pick the same row for the same query arbitrarily, which makes lock contention a much bigger issue than truly random picks would.' PostgreSQL is free to choose the fastest way to output the LIMIT 1 row, and it keeps picking the same rows, namely the ones that were updated last. But I can't live with that, as the latest-updated rows are all winning opportunities.
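One direction I'm considering (a sketch only, not a tested solution) is forcing a uniform pick with an explicit ORDER BY random() in the subquery of my original UPDATE. I'm aware this sorts every matching row, and that the pg_try_advisory_xact_lock(id) call in the WHERE clause may then be evaluated for every candidate row, so it may not survive my >30M-row case:

```sql
UPDATE opportunities s
SET    opportunity_available = false
FROM  (
   SELECT id
   FROM   opportunities
   WHERE  deal_id = #{@deal.id}
   AND    opportunity_available
   AND    pg_try_advisory_xact_lock(id)  -- caution: may grab a lock per scanned row
   ORDER  BY random()                    -- force a uniform pick, not an arbitrary one
   LIMIT  1
   FOR    UPDATE
   ) sub
WHERE  s.id = sub.id
RETURNING s.prize_id, s.id;
```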