I have a PySpark DataFrame with two columns: `features` and `label`.
`features` is a sparse vector that I built through several transformations, the last of which is `VectorAssembler`. I want to write this DataFrame to S3 in LIBSVM format, but I have not found any leads on how to do that.
Edit 1: I am looking for a solution that does not convert the DataFrame to an RDD.
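For reference, one approach that stays in the DataFrame API is Spark's built-in `libsvm` data source, which can be used for writing as well as reading. The sketch below is only a starting point: the helper name `write_libsvm` and the bucket path are hypothetical, and it assumes `features` is a `pyspark.ml` vector (which is what `VectorAssembler` produces) and that the cluster already has S3 access configured. The writer expects exactly two columns, a double label followed by the vector, so the helper casts and reorders them first.

```python
def write_libsvm(df, path, mode="overwrite"):
    """Write a (label, features) DataFrame to `path` via Spark's libsvm source.

    `write_libsvm` is a hypothetical helper name, not part of PySpark.
    """
    # The libsvm writer expects exactly two columns, in this order:
    # a DoubleType label, then an ML vector column.
    out = df.selectExpr("CAST(label AS DOUBLE) AS label", "features")
    # No RDD conversion is involved; this goes through DataFrameWriter.
    out.write.format("libsvm").mode(mode).save(path)


# Hypothetical usage; the bucket and prefix are placeholders, and the
# URI scheme (s3a:// vs s3://) depends on how the cluster is set up:
# write_libsvm(df, "s3a://my-bucket/path/to/libsvm-output")
```

The output is a directory of part files rather than a single file. If one file is needed, calling `df.coalesce(1)` before passing the DataFrame in should work, at the cost of writing from a single task.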