Temporary tables are effectively the same as in-memory tables thanks to caching and asynchronous I/O, and the temporary table solution does not require any overhead for converting between SQL and PL/SQL.
Confirming the results
Comparing the two versions with RunStats, the temporary table version looks much worse. All that junk for the temporary table version in Run1, and only a little extra memory for the PL/SQL version in Run2. At first it seems like PL/SQL should be the clear winner.
Type Name Run1 (temp) Run2 (PLSQL) Diff
----- -------------------------------- ------------ ------------ ------------
...
STAT physical read bytes 81,920 0 -81,920
STAT physical read total bytes 81,920 0 -81,920
LATCH cache buffers chains 104,663 462 -104,201
STAT session uga memory 445,488 681,016 235,528
STAT KTFB alloc space (block) 2,097,152 0 -2,097,152
STAT undo change vector size 2,350,188 0 -2,350,188
STAT redo size 2,804,516 0 -2,804,516
STAT temp space allocated (bytes) 12,582,912 0 -12,582,912
STAT table scan rows gotten 15,499,845 0 -15,499,845
STAT session pga memory 196,608 19,857,408 19,660,800
STAT logical read bytes from cache 299,958,272 0 -299,958,272
But at the end of the day only the wall clock time matters. Both the loading and the querying steps run much faster with temporary tables.
The PL/SQL version can be improved by replacing the BULK COLLECT
with cast(collect(test_o(MOD(a, 10), '' || MOD(a, 12))) as test_t) INTO t
. But it's still significantly slower than the temporary table version.
Optimized Reads
Reading from the small temporary table only uses the buffer cache, which is in memory. Run only the query part many times, and watch how the consistent gets from cache
(memory) increase while the physical reads cache
(disk) stay the same.
select name, value
from v$sysstat
where name in ('db block gets from cache', 'consistent gets from cache',
'physical reads cache');
Optimized Writes
Ideally there would be no physical I/O, especially since the temporary table is ON COMMIT DELETE ROWS
. And it sounds like the next version of Oracle may introduce such a mechanism. But it doesn't matter much in this case, the disk I/O does not seem to slow things down.
Run the load step multiple times, and then run select * from v$active_session_history order by sample_time desc;
. Most of the I/O is BACKGROUND
, which means nothing is waiting on it. I assume the temporary table internal logic is just a copy of regular DML mechanisms. In general, new table data may need to be written to disk, if it's committed. Oracle may start working on it, for example by moving data from the log buffer to disk, but there is no rush until there is an actual COMMIT
.
Where does the PL/SQL time go?
I have no clue. Are there multiple context switches, or a single conversion between the SQL and PL/SQL engines? As far as I know none of the available metrics show the time spent on switching between SQL and PL/SQL.
We may never know exactly why PL/SQL code is slower. I don't worry about it too much. The general answer is, the vast majority of database work has to be done in SQL anyway. It would make a lot of sense if Oracle spent more time optimizing the core of their database, SQL, than the add-on language, PL/SQL.
Additional notes
For performance testing it can be helpful to remove the connect by
logic into a separate step. That SQL is a great trick for loading data, but it can be very slow and resource intensive. It's more realistic to load a sample table once with that trick, and then insert from that table.
I tried using the new Oracle 12c feature, temporary undo, and the new 18c feature, private temporary tables. Neither one improved performance over regular temporary tables.
I wouldn't bet on it, but I can see a way that the results would completely change as the data gets larger. The log buffer and the buffer cache can only get so large. And eventually that background I/O could add up and overwhelm some processes, turning the BACKGROUND
wait into a FOREGROUND
wait. On the other hand, there's only so much PGA memory for the PL/SQL solution, and then things crash.
Finally, this partially confirms my skepticism of "in-memory databases". Caching is nothing new, databases have been doing it for decades.