I have a CSV of both textual and numerical data. I need to convert it to feature vector data in Spark (Double values). Is there any way to do that ?
I see some e.g where each keyword is mapped to some double value and use this to convert. However if there are multiple keywords, it is difficult to do this way.
Is there any other way out? I see Spark provides Extractors which will convert into feature vectors. Could someone please give an example?
48, Private, 105808, 9th, 5, Widowed, Transport-moving, Unmarried, White, Male, 0, 0, 40, United-States, >50K
42, Private, 169995, Some-college, 10, Married-civ-spouse, Prof-specialty, Husband, White, Male, 0, 0, 45, United-States, <=50K
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…