The most efficient way to achieve this:
- Run a single
UPDATE
per table.
- Only update nullable columns (not defined
NOT NULL
) with any actual empty string.
- Only update rows with any actual empty string.
- Leave other values unchanged.
This related answer has a plpgsql function that builds and runs the UPDATE
command using system catalog pg_attribute
automatically and safely for any given table:
Using the function f_empty2null()
from this answer, you can loop through selected tables like this:
DO
$do$
DECLARE
_tbl regclass;
BEGIN
FOR _tbl IN
SELECT c.oid::regclass
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r' -- only regular tables
AND n.nspname NOT LIKE 'pg_%' -- exclude system schemas
LOOP
RAISE NOTICE $$PERFORM f_empty2null('%');$$, _tbl;
-- PERFORM f_empty2null(_tbl); -- uncomment to prime the bomb
END LOOP;
END
$do$;
Careful! This updates all empty strings in all columns of all user tables in the DB. Be sure that's what you want or it might nuke your database.
You need UPDATE
privileges on all selected tables, of course.
As a child safety device I commented the payload.
You may have noted that I use the system catalogs directly, not the information schema (which would work, too). About this:
For repeated use
Here is an integrated solution for repeated use. Without safety devices:
CREATE OR REPLACE FUNCTION f_all_empty2null(OUT _tables int, OUT _rows int) AS
$func$
DECLARE
_typ CONSTANT regtype[] := '{text, bpchar, varchar, "char"}';
_sql text;
_row_ct int;
BEGIN
_tables := 0; _rows := 0;
FOR _sql IN
SELECT format('UPDATE %s SET %s WHERE %s'
, t.tbl
, string_agg(format($$%1$s = NULLIF(%1$s, '')$$, t.col), ', ')
, string_agg(t.col || $$ = ''$$, ' OR '))
FROM (
SELECT c.oid::regclass AS tbl, quote_ident(attname) AS col
FROM pg_namespace n
JOIN pg_class c ON c.relnamespace = n.oid
JOIN pg_attribute a ON a.attrelid = c.oid
WHERE n.nspname NOT LIKE 'pg_%' -- exclude system schemas
AND c.relkind = 'r' -- only regular tables
AND a.attnum >= 1 -- exclude tableoid & friends
AND NOT a.attisdropped -- exclude dropped columns
AND NOT a.attnotnull -- exclude columns defined NOT NULL!
AND a.atttypid = ANY(_typ) -- only character types
ORDER BY a.attnum
) t
GROUP BY t.tbl
LOOP
EXECUTE _sql;
GET DIAGNOSTICS _row_ct = ROW_COUNT; -- report nr. of affected rows
_tables := _tables + 1;
_rows := _rows + _row_ct;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM pg_temp.f_all_empty2null();
Returns:
_tables | _rows
---------+---------
23 | 123456
Note how I escaped both table and columns names properly!
c.oid::regclass AS tbl, quote_ident(attname) AS col
Consider:
Careful! Same warning as above.
Also consider the basic explanation in the answer I linked above: