Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
445 views
in Technique[技术] by (71.8m points)

hadoop - Hive command to execute NOT IN clause

I have two tables,tab1 & tab2.

tab1(T1)  tab2(T2)
a1         b1
b1         c1
c1         f1
d1         g1

I am looking for the values from table T1 that are not present in T2. In this case, the output should be a1 d1

I have tried with the following query but couldn't get the right solution.

select distinct tab1.T1 from tab1 left semi join tab2 on (tab1.T1!=tab2.T2);
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
SELECT t1.str
FROM tab1 t1 
LEFT OUTER JOIN tab2 t2 ON t1.str = t2.str
WHERE t2.str IS NULL;

Result:

OK
a1
d1

"Why is the t2.str is null condition there": Left outer joins ensure that all values from the first table are included in the results. So then what happens when there are no values in the second table: in that case all of the columns from the second table are reported as null.

So in the case above we are searching precisely for the condition that the second table entries are missing - and thus we:

  • Choose one of the never-empty (aka not null) columns from table two.
  • So: is number an always-present column? If not then please choose another one
  • Specify the condition "table1-alias"."table1-never-null-column" = null. That means that the record is actually not present in the join condition - and thus we found the records existing only in table 1.

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...