I am trying to join 2 tables in HIVE using a key and date from table B and the corresponding previous closest date record from table A at the time of join. For example: Below are the 2 input tables
<----------TABLE A-------------> <------------TABLE B------------>
A_id A_date changed_col B_id B_date B_value A_id
**** ****** *********** **** ****** ******* *****
A01 2017-03-20 ABC B01 2017-04-02 200 A01
A01 2017-04-01 XYZ B01 2017-04-04 500 A01
A01 2017-04-05 LLL B02 2017-04-07 900 A02
A02 2017-04-06 KKK B02 2017-04-30 800 A02
A02 2017-04-12 JJJ
However when I JOIN table B with Table A, it should look for the nearest lowest date in table A for the same key(A_id). Below is the expected output table:
B_id B_date A_id A_date changed_col B_value
**** ****** **** ****** *********** *******
B01 2017-02-04 A01 2017-01-04 XYZ 200
B01 2017-04-04 A01 2017-01-04 XYZ 500
B02 2017-04-07 A02 2017-04-06 KKK 900
B02 2017-04-30 A02 2017-04-12 JJJ 800
Any help is much appreciated. Thanks
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…