Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

pandas - Adding stats code to a function in Python

I'm relatively new to Python and am trying to learn how to write functions. The answer to this post shows how to get certain stats from a DataFrame, and I would like to wrap it in a function.

This is my attempt, but it fails with AttributeError: 'SeriesGroupBy' object has no attribute 'test_for_B':

def test_multi_match(df_in,test_val):
    test_for_B = df_in == test_val
    contigious_groups = ((df_in == test_val) & (df_in != df_in.shift())).cumsum() + 1
    counts = df_in.groupby(contigious_groups).test_for_B.sum()
    counts.value_counts() / contigious_groups.max()

Can someone please help me put this code in a function I can reuse on other DataFrames? Thanks.

Edit: Removed the long AttributeError traceback now that this has been answered.


1 Reply


Here you go:

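def repeat_stats(series, var):
    isvar = series == var                    # True where the value equals var
    wasntvar = series != series.shift()      # True where the value differs from the previous row
    cont_grps = (isvar & wasntvar).cumsum()  # run id, incremented at the start of each run of var
    # Skip rows before the first run (id 0), then sum the mask per run to get each run's length
    counts = isvar.loc[cont_grps.astype(bool)].groupby(cont_grps).sum()
    # Share of runs having each length
    return counts.value_counts() / cont_grps.max()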

repeat_stats(rng.initial_data, 'B')

3.0    0.5
2.0    0.5
Name: initial_data, dtype: float64
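
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The DataFrame `df` and its contents are made up for illustration (only the column name `initial_data` comes from the call above), and the grouping key used to trigger the error is a simplified stand-in for the one in the question. It also reproduces the AttributeError: `test_for_B` is a local variable, not a column or attribute of the grouped object, so `.test_for_B` on the groupby cannot find it.

```python
import pandas as pd


def repeat_stats(series, var):
    """Share of runs of `var` in `series`, broken down by run length."""
    isvar = series == var                    # True where the value equals var
    wasntvar = series != series.shift()      # True where the value differs from the previous row
    cont_grps = (isvar & wasntvar).cumsum()  # run id, incremented at each run start
    # Skip rows before the first run (id 0), then sum the mask per run.
    counts = isvar.loc[cont_grps.astype(bool)].groupby(cont_grps).sum()
    return counts.value_counts() / cont_grps.max()


# Hypothetical sample data: one run of three Bs and one run of two Bs.
df = pd.DataFrame({"initial_data": list("ABBBABBA")})

# Reproduce the failure from the question: attribute access on the
# grouped object looks for something named test_for_B on that object,
# and a local variable of that name does not count.
test_for_B = df["initial_data"] == "B"
caught = None
try:
    df["initial_data"].groupby(test_for_B.cumsum()).test_for_B
except AttributeError as exc:
    caught = exc
print(type(caught).__name__)

# The working version groups the boolean mask itself.
result = repeat_stats(df["initial_data"], "B")
print(result)
```

With this sample there are two runs of B, of lengths 3 and 2, so each length accounts for half of the runs. Whether the run lengths print as integers or floats depends on the pandas version.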
