ABSOLUTELY. A hash match would be a huge improvement. Building the hash table on the smaller 19,223-row table and then probing into it with the larger 65,991-row table is a far smaller operation than the nested loops join, which requires 1,268,544,993 row comparisons.
The only reason the server would choose the nested loops is that it badly underestimated the number of rows involved. Do your tables have statistics on them, and if so, are they being updated regularly? Statistics are what enable the server to choose good execution plans.
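As a quick starting point, something along these lines shows when statistics were last refreshed and how to refresh them (a minimal sketch; dbo.TableA stands in for your actual table names):

-- When were the statistics on this table last updated?
SELECT s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.TableA');

-- Refresh one table's statistics with a full scan
UPDATE STATISTICS dbo.TableA WITH FULLSCAN;

-- Or refresh statistics across the whole database
EXEC sp_updatestats;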
If you've properly addressed statistics and are still having the problem, you can force a HASH join like so:

SELECT *
FROM TableA A                         -- the smaller table: the hash build input
    LEFT HASH JOIN TableB B           -- the larger table: the probe input
        ON A.KeyColumn = B.KeyColumn  -- placeholder: use your actual join predicate
Please note that the moment you use a join hint like this, it also forces the join order for the whole query. That means you have to arrange all of your tables so that the order they are joined in makes sense. Generally you would examine the execution plan the server is already producing and reorder the tables in the query to match it. If you're not familiar with how to do this, the basics are that each "left" input comes first, and in graphical execution plans the left input is the upper one. A complex join involving many tables may need joins grouped together inside parentheses, or RIGHT JOIN in place of LEFT JOIN, to get an optimal plan (these swap the left and right inputs while still introducing each table at the correct point in the join order), as sketched below.
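As an illustration only (TableC and every column name here is hypothetical, not taken from your query), parenthesized joins and a LEFT/RIGHT swap both let you keep the forced order sensible:

-- Parentheses force TableB and TableC to be joined as a unit
-- before that result is joined to TableA.
SELECT *
FROM TableA A
    LEFT HASH JOIN (TableB B
        INNER HASH JOIN TableC C
            ON B.CID = C.CID)
        ON A.BID = B.BID;

-- Same logical result as "TableA A LEFT JOIN TableB B",
-- but TableB is now introduced first in the forced join order.
SELECT *
FROM TableB B
    RIGHT HASH JOIN TableA A
        ON A.BID = B.BID;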
It is generally best to avoid using join hints and forcing join order, so do whatever else you can first! You could look into the indexes on the tables, fragmentation, reducing column sizes (such as using varchar instead of nvarchar where Unicode is not required), or splitting the query into parts (insert into a temp table first, then join to that).
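For the temp-table approach, a sketch like this often works well (the filter, column, and index names are placeholders for whatever your query actually needs), partly because temp tables get their own statistics:

-- Materialise the smaller set first, filtering as early as possible
SELECT A.KeyColumn, A.OtherColumn
INTO #SmallSet
FROM TableA A
WHERE A.SomeFilter = 1;

-- An index on the join column helps the second step
CREATE CLUSTERED INDEX IX_SmallSet ON #SmallSet (KeyColumn);

-- Then join the larger table to the temp table and let the
-- optimizer choose the join type on its own
SELECT *
FROM #SmallSet S
    LEFT JOIN TableB B
        ON B.KeyColumn = S.KeyColumn;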