Your question is not very clear. But as per the shown inputs and the desired output, I believe what you want is simply join df_output_2
and df_input_1
on ID = Hier_2
to get the relation between ID
s then join with df_output_1
on ID = Hier_1
to get Sr
values :
df_output_2 = df_output_2.alias("out2").join(
df_input_1.alias("in1"),
col("out2.ID") == col("in1.Hier_2"), "left"
)
.join(
df_output_1.alias("out1"),
col("out1.ID") == col("in1.Hier_1"), "left"
)
.selectExpr("out2.Sr", "coalesce(out2.Parent_Sr, out1.Sr) as Parent_Sr", "out2.name", "out2.ID")
df_output_2.show()
#+---+---------+--------------------+---+
#| Sr|Parent_Sr| name| ID|
#+---+---------+--------------------+---+
#|998| 123|Pan is father of Fen|Pan|
#|665| 143|Man is father of Men|Man|
#|887| 456|Can is father of Den|Can|
#|554| 457|Van is father of Ren|Van|
#|776| 789|Dan is father of Qen|Dan|
#+---+---------+--------------------+---+
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…