I have a very large DataFrame that I unfortunately can't post here for a demonstration, but I am wondering whether there is an explanation for a problem I have with my code using the groupby() method.
So, let df be a numeric pandas.DataFrame with shape (11815, 409).
Let arr = df.as_matrix().astype(float).
Why do I get this?
print df.groupby(["col1","col2"]).mean().reset_index().shape
>> (624, 409)
While:
print pandas.DataFrame(arr).groupby([df["col1"],df["col2"]]).mean().values.shape
>> (623, 409)
Note that this problem also shows up with ["col1","col2","col3"], but it doesn't occur with ["col2","col3"], or with "col1", "col2" or "col3" on their own.
So there seems to be a problem with 'col1', but what could it be?
Any explanation, please?
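
Since the real data can't be shared, here is a minimal sketch of how the missing group could be located by comparing the group keys produced by the two approaches. It runs on toy data only, not the real (11815, 409) frame. The helper name missing_groups is hypothetical, and to_numpy() is assumed as a stand-in for as_matrix(), which is no longer available in recent pandas versions.

```python
import pandas as pd


def missing_groups(df, keys):
    """Return the key tuples that appear in only one of the two groupings."""
    # Grouping by column names, as in the first snippet.
    by_name = df.groupby(keys).mean()

    # Grouping a positional copy of the data by the original key Series,
    # as in the second snippet (to_numpy() replaces as_matrix() here).
    arr = df.to_numpy().astype(float)
    by_series = pd.DataFrame(arr).groupby([df[k] for k in keys]).mean()

    # Iterating a MultiIndex yields one tuple per group key.
    return sorted(set(by_name.index) ^ set(by_series.index))


# Toy data for illustration only.
toy = pd.DataFrame({
    "col1": [1, 1, 2, 2],
    "col2": [0, 1, 0, 0],
    "x": [1.0, 2.0, 3.0, 4.0],
})

diff = missing_groups(toy, ["col1", "col2"])
print(diff)
```

On the real df, any tuple returned by missing_groups(df, ["col1", "col2"]) would be the (col1, col2) pair that exists in one result but not in the other, which should show which rows of 'col1' to inspect.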