There is a difference. value_counts returns:
The resulting object will be in descending order so that the first element is the most frequently-occurring element.
but count does not; it sorts the output by the index (created from the column passed to groupby('col')).
df.groupby('colA').count()
aggregates all columns of df with the function count, so it counts the values in each column, excluding NaNs.
So if you need to count only one column, use:
df.groupby('colA')['colA'].count()
Sample:
import numpy as np
import pandas as pd

df = pd.DataFrame({'colB':list('abcdefg'),
                   'colC':[1,3,5,7,np.nan,np.nan,4],
                   'colD':[np.nan,3,6,9,2,4,np.nan],
                   'colA':['c','c','b','a',np.nan,'b','b']})
print (df)
  colA colB  colC  colD
0    c    a   1.0   NaN
1    c    b   3.0   3.0
2    b    c   5.0   6.0
3    a    d   7.0   9.0
4  NaN    e   NaN   2.0
5    b    f   NaN   4.0
6    b    g   4.0   NaN
print (df['colA'].value_counts())
b 3
c 2
a 1
Name: colA, dtype: int64
print (df.groupby('colA').count())
      colB  colC  colD
colA                  
a        1     1     1
b        3     2     2
c        2     2     1
print (df.groupby('colA')['colA'].count())
colA
a 1
b 3
c 2
Name: colA, dtype: int64
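Since the two results differ only in ordering here, a minimal sketch of how one could be reordered to match the other may help. The variable names (vc, gc, gc_by_count, vc_by_index) are illustrative assumptions, not part of the original sample, and the frame is reduced to colA only:

```python
import numpy as np
import pandas as pd

# Same colA values as in the sample above (other columns omitted).
df = pd.DataFrame({'colA': ['c', 'c', 'b', 'a', np.nan, 'b', 'b']})

vc = df['colA'].value_counts()            # ordered by count, descending
gc = df.groupby('colA')['colA'].count()   # ordered by the group key (index)

# Sort the groupby result by count to mirror the value_counts ordering ...
gc_by_count = gc.sort_values(ascending=False)

# ... or sort value_counts by index to mirror the groupby ordering.
vc_by_index = vc.sort_index()

print(gc_by_count)
print(vc_by_index)
```

Both approaches skip the NaN row by default, so the counts agree; only the ordering changes. Note that when several keys share the same count, their relative order after sort_values is not something this sketch controls.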