I have an rdd of (String,Int) which is sorted by key
val data = Array(("c1",6), ("c2",3),("c3",4))
val rdd = sc.parallelize(data).sortByKey
Now I want to start the value for the first key with zero and the subsequent keys as sum of the previous keys.
Eg: c1 = 0 , c2 = c1's value , c3 = (c1 value +c2 value) , c4 = (c1+..+c3 value)
expected output:
(c1,0), (c2,6), (c3,9)...
Is it possible to achieve this ?
I tried it with map but the sum is not preserved inside the map.
var sum = 0 ;
val t = keycount.map{ x => { val temp = sum; sum = sum + x._2 ; (x._1,temp); }}
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…