I have a data set with multiple columns. A function needs to be invoked to compute result using the data available within a row. So I used a case class with a method and created a data set using it. As example,
case class testCase(x: Double, a1: Array[Double], a2: Array[Double]) {
var someInt = 0
def myMethod1(): Unit = {...} // use x, a1 and a2
def myMethod2(): Unit = {...} // use x, a1 and a2
def result(): { return someInt }
It is called from the main()
as
val res = myDS.map(_.result()).toDF("result")
The problem I am facing is that while the code works correctly, no matter how I invoke, unlike for the other parts of the program, the above statement does not work concurrently. Irrespective of the number executors, cores and repartition
ing, only one instance of a method seems to work at time!
Any hints to what I should look at would be appreciated.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…