I am hoping to get some help improving a method for UPSERTing rows passed in as an array. I'm on Postgres 11.4 with deployment on RDS. I've got a lot of tables to sort out, but am starting with a simple table for experimentation:
BEGIN;
-- citext is shipped as an extension and has to be installed once per database;
-- the GIN index on a boolean column below also relies on btree_gin for its operator class.
CREATE EXTENSION IF NOT EXISTS citext;
CREATE EXTENSION IF NOT EXISTS btree_gin;
DROP TABLE IF EXISTS "data"."item" CASCADE;
CREATE TABLE IF NOT EXISTS "data"."item" (
"id" uuid NOT NULL DEFAULT NULL,
"marked_for_deletion" boolean NOT NULL DEFAULT false,
"name_" citext NOT NULL DEFAULT NULL,
CONSTRAINT item_id_pkey
PRIMARY KEY ("id")
);
CREATE INDEX item_marked_for_deletion_ix_bgin ON "data"."item" USING GIN("marked_for_deletion") WHERE marked_for_deletion = true;
ALTER TABLE "data"."item" OWNER TO "user_change_structure";
COMMIT;
The function, so far, looks like this:
DROP FUNCTION IF EXISTS data.item_insert_array (item[]);
CREATE OR REPLACE FUNCTION data.item_insert_array (data_in item[])
RETURNS int
AS $$
    INSERT INTO data.item (
        id,
        marked_for_deletion,
        name_)
    SELECT
        d.id,
        d.marked_for_deletion,
        d.name_
    FROM unnest(data_in) AS d
    ON CONFLICT (id) DO UPDATE SET
        marked_for_deletion = EXCLUDED.marked_for_deletion,
        name_ = EXCLUDED.name_;

    SELECT cardinality(data_in); -- array_length() needs a dimension argument, so cardinality() is simpler here. ¯\_(ツ)_/¯
$$ LANGUAGE sql;
ALTER FUNCTION data.item_insert_array(item[]) OWNER TO user_bender;
And a call looks like this:
SELECT * FROM item_insert_array(
    array[
        ('2f888809-2777-524b-abb7-13df413440f5', true, 'Salad fork'),
        ('f2924dda-8e63-264b-be55-2f366d9c3caa', false, 'Melon baller'),
        ('d9ecd18d-34fd-5548-90ea-0183a72de849', true, 'Fondue fork')
    ]::item[]
);
I'm trying to develop a system for UPSERT that is injection-safe and that performs well. I'll be replacing a more naive multi-value INSERT where the statement is composed entirely on the client side, meaning I can't be certain that I'm not introducing defects when concatenating the text. (I asked about this here: Postgres bulk insert/update that's injection-safe. Perhaps a function that takes an array?)
I've gotten this far with the help of various excellent answers:
https://dba.stackexchange.com/questions/224785/pass-array-of-mixed-type-into-stored-function
https://dba.stackexchange.com/questions/131505/use-array-of-composite-type-as-function-parameter-and-access-it
https://dba.stackexchange.com/questions/225176/how-to-pass-an-array-to-a-plpgsql-function-with-variadic-parameter/
I'm not trying for the most complex version of all of this; for instance, I'm fine with a single function per table, and fine that every array element has exactly the same format. I'll write code generators to build out everything I need once I've got the basic pattern sorted out. So, I don't think I need VARIADIC parameter lists, polymorphic elements, or everything-packaged-as-JSON. (Although I will need to insert JSON from time to time, that's just data.)
I could still use some remedial help with a few questions:
Is the code above injection-safe, or do I need to rewrite it in PL/pgSQL to use something like FOREACH with an EXECUTE...USING or FORMAT or quote_literal, etc.?
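For reference, here's roughly what I imagine that PL/pgSQL rewrite would look like. This is untested and only sketched out so it's clear what I'm asking about; the function name is made up:

DROP FUNCTION IF EXISTS data.item_insert_loop (item[]);

CREATE OR REPLACE FUNCTION data.item_insert_loop (data_in item[])
RETURNS int
AS $$
DECLARE
    r item;
BEGIN
    -- Loop over the array and run one parameterized INSERT per element.
    FOREACH r IN ARRAY data_in LOOP
        EXECUTE
            'INSERT INTO data.item (id, marked_for_deletion, name_)
             VALUES ($1, $2, $3)
             ON CONFLICT (id) DO UPDATE SET
                 marked_for_deletion = EXCLUDED.marked_for_deletion,
                 name_ = EXCLUDED.name_'
            USING r.id, r.marked_for_deletion, r.name_;
    END LOOP;
    RETURN cardinality(data_in);
END;
$$ LANGUAGE plpgsql;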
I'm declaring the input parameter as item[]. That's fine since I'm passing in all of the fields for this tiny table, but I won't always want to pass in all columns. I thought that I could use anyarray as the type within the function, but I can't figure out how to pass in an array in that scenario. Is there a generic array-of-stuff type? I can create custom types for each of these functions, but I'd rather not, mainly because I would only use each type in that one situation.
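The closest thing I've found to a "generic" input is to pass jsonb instead and expand it inside the function, along these lines. Again just a sketch, with an invented function name, and I haven't settled on it:

CREATE OR REPLACE FUNCTION data.item_insert_jsonb (data_in jsonb)
RETURNS int
AS $$
    -- jsonb_to_recordset() lets the column list be declared inline, so no
    -- per-function composite type is needed; keys missing from an element
    -- come through as NULL (which this table's NOT NULL columns would reject,
    -- but a table with defaults or nullable columns wouldn't mind).
    INSERT INTO data.item (id, marked_for_deletion, name_)
    SELECT d.id, d.marked_for_deletion, d.name_
    FROM jsonb_to_recordset(data_in)
        AS d (id uuid, marked_for_deletion boolean, name_ citext)
    ON CONFLICT (id) DO UPDATE SET
        marked_for_deletion = EXCLUDED.marked_for_deletion,
        name_ = EXCLUDED.name_;

    SELECT jsonb_array_length(data_in);
$$ LANGUAGE sql;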
It seems like it would make sense to implement this as a procedure rather than a function, so that I can handle the transaction within it. Am I off base on that?
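If it helps, this is roughly the shape I picture for the procedure version on 11.x (sketch only, with a made-up name; as far as I can tell a procedure can't RETURN a value, and COMMIT inside it only works when the CALL isn't wrapped in an outer transaction block):

CREATE OR REPLACE PROCEDURE data.item_upsert_proc (data_in item[])
AS $$
BEGIN
    INSERT INTO data.item (id, marked_for_deletion, name_)
    SELECT d.id, d.marked_for_deletion, d.name_
    FROM unnest(data_in) AS d
    ON CONFLICT (id) DO UPDATE SET
        marked_for_deletion = EXCLUDED.marked_for_deletion,
        name_ = EXCLUDED.name_;

    COMMIT;  -- allowed in a procedure, but not when called inside an explicit transaction block
END;
$$ LANGUAGE plpgsql;

CALL data.item_upsert_proc(array[('2f888809-2777-524b-abb7-13df413440f5', true, 'Salad fork')]::item[]);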
Any stylistic (or otherwise) advice on what to return? I'm returning a count now, which is at least a little useful.
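One alternative I've been considering is counting what the INSERT actually touched rather than the length of the input array, by wrapping the statement in a CTE and counting the RETURNING output. With DO UPDATE the two numbers should match anyway, but with DO NOTHING they wouldn't. The function body would become something like this (sketch, untested):

    WITH upserted AS (
        INSERT INTO data.item (id, marked_for_deletion, name_)
        SELECT d.id, d.marked_for_deletion, d.name_
        FROM unnest(data_in) AS d
        ON CONFLICT (id) DO UPDATE SET
            marked_for_deletion = EXCLUDED.marked_for_deletion,
            name_ = EXCLUDED.name_
        RETURNING 1
    )
    SELECT count(*)::int FROM upserted;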
I'm out over my skis a bit here, so any general comments will be much appreciated. For clarity, what I'm after is a pattern for inserting multiple rows safely and with decent performance that, ideally, doesn't involve a custom type per function or COPY.
Thanks!
See Question&Answers more detail:
os