I am using Spark SQL 2.4.1.
How can I perform different joins depending on the value of a column?
Sample data
import spark.implicits._

val data = List(
  ("20", "score", "school", 14, 12),
  ("21", "score", "school", 13, 13),
  ("22", "rate",  "school", 11, 14),
  ("21", "rate",  "school", 13, 12)
)
val df = data.toDF("id", "code", "entity", "value1", "value2")
+---+-----+------+------+------+
| id| code|entity|value1|value2|
+---+-----+------+------+------+
| 20|score|school|    14|    12|
| 21|score|school|    13|    13|
| 22| rate|school|    11|    14|
| 21| rate|school|    13|    12|
+---+-----+------+------+------+
Based on the value of the "code" column, I need to join with different lookup tables.
val data1 = List(
  ("22", 11, "A"),
  ("22", 14, "B"),
  ("20", 13, "C"),
  ("21", 12, "C"),
  ("21", 13, "D")
)
val rateDs = data1.toDF("id", "map_code", "map_val")
val scoreDs = // scoreTable
If the "code" column value is "rate", I need to join with rateDs.
If the "code" column value is "score", I need to join with scoreDs.
How can I handle this kind of thing in Spark? Is there an optimal way to achieve it?
Expected result for the "rate" rows:
+---+-----+------+------+------+
| id| code|entity|value1|value2|
+---+-----+------+------+------+
| 22| rate|school|     A|     B|
| 21| rate|school|     D|     C|
+---+-----+------+------+------+
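
One approach I have considered (a sketch only, not necessarily the optimum; it assumes scoreDs has the same (id, map_code, map_val) shape as rateDs, and the helper name lookupValues is mine): split df by the code value, join each subset with its own lookup table once per value column, then union the pieces back together.

import org.apache.spark.sql.DataFrame
import spark.implicits._

// Join one subset against its lookup table twice: once to resolve value1,
// once to resolve value2, matching on id plus the original numeric value.
def lookupValues(src: DataFrame, lookup: DataFrame): DataFrame = {
  val m1 = lookup.select($"id".as("id1"), $"map_code".as("mc1"), $"map_val".as("mv1"))
  val m2 = lookup.select($"id".as("id2"), $"map_code".as("mc2"), $"map_val".as("mv2"))
  src
    .join(m1, $"id" === $"id1" && $"value1" === $"mc1", "left")
    .join(m2, $"id" === $"id2" && $"value2" === $"mc2", "left")
    .select($"id", $"code", $"entity", $"mv1".as("value1"), $"mv2".as("value2"))
}

// Route each code value to its lookup table, then union the results.
val result = lookupValues(df.filter($"code" === "rate"), rateDs)
  .union(lookupValues(df.filter($"code" === "score"), scoreDs))

result.filter($"code" === "rate").show()

If there are many distinct code values, the same pattern should generalize to folding over a Map from code value to lookup table, which keeps each join small instead of building one large conditional join.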