I have a Spark DataFrame that I group by a column and aggregate with a count:
df.groupBy('a).agg(count("a")).show
+---------+----------------+
|a |count(a) |
+---------+----------------+
| null| 0|
| -90| 45684|
+---------+----------------+
df.select('a).filter('a isNull).count
returns
warning: there was one feature warning; re-run with -feature for details
res9: Long = 26834
which clearly shows that the null values were not counted in the initial aggregation.
What is the reason for this behaviour? Since the null group does appear in the
grouping result, I would have expected its count to be reported properly as well.
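For comparison, standard SQL defines COUNT(column) to skip NULLs, while COUNT(*) counts every row; if Spark's count("a") follows that convention, it would explain the 0 above. A minimal sketch of the distinction in plain Python (not Spark; the toy list and counts are illustrative only):

```python
# Plain-Python sketch of SQL-style COUNT semantics, using None for NULL.
# COUNT(col) skips NULLs, while COUNT(*) counts every row.
rows = [None, None, -90, -90, -90]  # toy stand-in for column "a"

count_star = len(rows)                           # COUNT(*): all rows -> 5
count_a = sum(1 for v in rows if v is not None)  # COUNT(a): non-null -> 3

print(count_star, count_a)  # 5 3
```

In Spark, the analogous difference would be `agg(count("*"))` (or `count(lit(1))`) versus `agg(count("a"))`.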