I have a Spark DataFrame of the following form:
col1|col2|col3|col4
xxxx|yyyy|zzzz|[1111],[2222]
I want my output to look like this:
col1|col2|col3|col4|col5
xxxx|yyyy|zzzz|1111|2222
col4 is an array, and I want to split its elements into separate columns. How can this be done?
I have seen many answers that use flatMap, but they add rows. I want each array element to go into its own column while staying in the same row.
This is my actual schema:
root
|-- PRIVATE_IP: string (nullable = true)
|-- PRIVATE_PORT: integer (nullable = true)
|-- DESTINATION_IP: string (nullable = true)
|-- DESTINATION_PORT: integer (nullable = true)
|-- collect_set(TIMESTAMP): array (nullable = true)
| |-- element: string (containsNull = true)
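To make the goal concrete, here is a minimal sketch of the result I am after, written against the sample data above rather than the real schema. It assumes the number of array elements to extract is known in advance (2 here); `splitArray`, the `ts` prefix, and the object name are hypothetical names made up for illustration. It uses `Column.getItem` to pick elements by index, so no rows are added. For the real schema, the `collect_set(TIMESTAMP)` column would probably be easier to handle if it were given a simple alias first.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SplitArrayColumn {
  // Hypothetical helper: copies the first `n` elements of the array column
  // `arrayCol` into new columns named <prefix>_0 ... <prefix>_(n-1),
  // then drops the original array column. The row count is unchanged.
  def splitArray(df: DataFrame, arrayCol: String, n: Int, prefix: String): DataFrame = {
    val withElements = (0 until n).foldLeft(df) { (acc, i) =>
      acc.withColumn(s"${prefix}_$i", col(arrayCol).getItem(i))
    }
    withElements.drop(arrayCol)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption: Spark is on the classpath).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-array-column")
      .getOrCreate()
    import spark.implicits._

    // Sample row mirroring the example above: col4 is an array of strings.
    val df = Seq(("xxxx", "yyyy", "zzzz", Seq("1111", "2222")))
      .toDF("col1", "col2", "col3", "col4")

    // Resulting columns: col1, col2, col3, ts_0, ts_1
    splitArray(df, "col4", 2, "ts").show()

    spark.stop()
  }
}
```

Since `collect_set` can return arrays of different lengths per row, a fixed `n` is a real limitation of this sketch, and I am not sure how indexes past the end of a shorter array behave across Spark versions and settings.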
Also, could someone please explain how to do this with both DataFrames and RDDs?