Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
305 views
in Technique[技术] by (71.8m points)

sql server - SQL "select where not in subquery" returns no results

Disclaimer: I have figured out the problem (I think), but I wanted to add this issue to Stack Overflow since I couldn't (easily) find it anywhere. Also, someone might have a better answer than I do.

I have a database where one table "Common" is referenced by several other tables. I wanted to see what records in the Common table were orphaned (i.e., had no references from any of the other tables).

I ran this query:

select *
from Common
where common_id not in (select common_id from Table1)
and common_id not in (select common_id from Table2)

I know that there are orphaned records, but no records were returned. Why not?

(This is SQL Server, if it matters.)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Update:

These articles in my blog describe the differences between the methods in more detail:


There are three ways to do such a query:

  • LEFT JOIN / IS NULL:

    SELECT  *
    FROM    common
    LEFT JOIN
            table1 t1
    ON      t1.common_id = common.common_id
    WHERE   t1.common_id IS NULL
    
  • NOT EXISTS:

    SELECT  *
    FROM    common
    WHERE   NOT EXISTS
            (
            SELECT  NULL
            FROM    table1 t1
            WHERE   t1.common_id = common.common_id
            )
    
  • NOT IN:

    SELECT  *
    FROM    common
    WHERE   common_id NOT IN
            (
            SELECT  common_id
            FROM    table1 t1
            )
    

When table1.common_id is not nullable, all these queries are semantically the same.

When it is nullable, NOT IN is different, since IN (and, therefore, NOT IN) return NULL when a value does not match anything in a list containing a NULL.

This may be confusing but may become more obvious if we recall the alternate syntax for this:

common_id = ANY
(
SELECT  common_id
FROM    table1 t1
)

The result of this condition is a boolean product of all comparisons within the list. Of course, a single NULL value yields the NULL result which renders the whole result NULL too.

We never cannot say definitely that common_id is not equal to anything from this list, since at least one of the values is NULL.

Suppose we have these data:

common

--
1
3

table1

--
NULL
1
2

LEFT JOIN / IS NULL and NOT EXISTS will return 3, NOT IN will return nothing (since it will always evaluate to either FALSE or NULL).

In MySQL, in case on non-nullable column, LEFT JOIN / IS NULL and NOT IN are a little bit (several percent) more efficient than NOT EXISTS. If the column is nullable, NOT EXISTS is the most efficient (again, not much).

In Oracle, all three queries yield same plans (an ANTI JOIN).

In SQL Server, NOT IN / NOT EXISTS are more efficient, since LEFT JOIN / IS NULL cannot be optimized to an ANTI JOIN by its optimizer.

In PostgreSQL, LEFT JOIN / IS NULL and NOT EXISTS are more efficient than NOT IN, sine they are optimized to an Anti Join, while NOT IN uses hashed subplan (or even a plain subplan if the subquery is too large to hash)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...