Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
483 views
in Technique[技术] by (71.8m points)

scala - Unable to find Encode[Char] while using flatMap with toCharArray in spark

import spark.implicits._
import org.apache.spark.sql.functions._
var names = Seq("ABC","XYZ").toDF("names")
var data = names.flatMap(name=>name.getString(0).toCharArray).map(rec=> 
                              (rec,1)).rdd.reduce((x,y)=>('S',x._2 + y._2))

ERROR: Error:(20, 27) Unable to find encoder for type Char. An implicit Encoder[Char] is needed to store Char instances in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. var data = names.flatMap(name=>name.getString(0).toCharArray).map(rec=>(rec,1)).rdd.reduce((x,y)=>('S',x._2 + y._2))

question from:https://stackoverflow.com/questions/65646674/unable-to-find-encodechar-while-using-flatmap-with-tochararray-in-spark

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can convert the dataframe to RDD first before doing the flatMap and map operations:

var data = names.rdd
                .flatMap(name => name.getString(0).toCharArray)
                .map(rec => (rec, 1))
                .reduce((x, y) => ('S', x._2 + y._2))

which will return 6, because you're just counting the number of chars in the first column of the dataframe. Not sure if this is your desired output.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...