Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
321 views
in Technique by (71.8m points)

python - Pandas DataFrame.groupby() weird behaviour with numpy array

I have a very large dataframe that unfortunately I can't post here as a demonstration, but I am wondering whether there is an explanation for a problem I have with my code when using the groupby() method.

So, let df be a numeric pandas.DataFrame with (11815, 409) as shape.

Let arr = df.as_matrix().astype(float).

Why do I get this?

print(df.groupby(["col1","col2"]).mean().reset_index().shape)
>> (624, 409)

While:

print(pandas.DataFrame(arr).groupby([df["col1"],df["col2"]]).mean().values.shape)
>> (623, 409)

Note that this problem also shows up with ["col1","col2","col3"] but does not occur with ["col2","col3"], "col1", "col2" or "col3".

So there seems to be a problem with 'col1', but what could it be?

Any explanation, please?
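
One possible cause, which cannot be confirmed without the real data, is index alignment. When the group keys are passed as Series (df["col1"], df["col2"]) to a frame that has a different index, pandas lines the keys up by index label rather than by position. pandas.DataFrame(arr) gets a fresh 0..n-1 index, so if df's index is anything else, some rows can end up with NaN keys and be dropped from the result. The sketch below reproduces a one-row difference on toy data; the values, the extra column "x" and the shifted index are all assumptions made up for illustration, not taken from the real dataframe.

```python
import pandas as pd

# Hypothetical toy data. The index deliberately starts at 1 instead of 0,
# standing in for a dataframe whose index is not the default 0..n-1.
df = pd.DataFrame(
    {
        "col1": [1.0, 1.0, 2.0, 3.0],
        "col2": [10.0, 10.0, 20.0, 30.0],
        "x": [0.1, 0.2, 0.3, 0.4],
    },
    index=[1, 2, 3, 4],
)

# Same idea as df.as_matrix().astype(float) in older pandas versions.
arr = df.values.astype(float)

# Grouping by column name: keys and rows come from the same frame.
by_name = df.groupby(["col1", "col2"]).mean().reset_index()

# Grouping a rebuilt frame by Series keys: the keys are aligned on index
# labels. Label 0 has no key (NaN, so that row is dropped) and the key at
# label 4 is never used, so the (3.0, 30.0) group disappears.
rebuilt = pd.DataFrame(arr)
by_series = rebuilt.groupby([df["col1"], df["col2"]]).mean()

# Passing plain arrays as keys avoids alignment; rows match by position.
by_position = rebuilt.groupby([df["col1"].values, df["col2"].values]).mean()

print(by_name.shape, by_series.shape, by_position.shape)
```

If this is what is happening, comparing df.index with the default 0..n-1 index (or passing .values as the keys, as in the last call) should show it. It would also fit the observation that only some key combinations are affected, since a group vanishes only when the misaligned row was its sole member.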

Whoever fights monsters for too long becomes a monster himself; gaze too long into the abyss, and the abyss will gaze back…
1 Reply

0 votes
by (71.8m points)
Waiting for answers

...