Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Why can't I apply shift from within a pandas function?

I am trying to build a function that uses .shift(), but it raises an error. Consider this:

In [40]:

import pandas as pd
from pandas import DataFrame

data={'level1':[20,19,20,21,25,29,30,31,30,29,31],
      'level2': [10,10,20,20,20,10,10,20,20,10,10]}
index= pd.date_range('12/1/2014', periods=11)
frame=DataFrame(data, index=index)
frame

Out[40]:
            level1  level2
2014-12-01      20      10
2014-12-02      19      10
2014-12-03      20      20
2014-12-04      21      20
2014-12-05      25      20
2014-12-06      29      10
2014-12-07      30      10
2014-12-08      31      20
2014-12-09      30      20
2014-12-10      29      10
2014-12-11      31      10

A normal function works fine. To demonstrate, I calculate the same result twice, once directly and once through a function:

In [63]:
frame['horizontaladd1']=frame['level1']+frame['level2']#works

def horizontaladd(x):
    test=x['level1']+x['level2']
    return test
frame['horizontaladd2']=frame.apply(horizontaladd, axis=1)
frame
Out[63]:
            level1  level2  horizontaladd1  horizontaladd2
2014-12-01      20      10              30              30
2014-12-02      19      10              29              29
2014-12-03      20      20              40              40
2014-12-04      21      20              41              41
2014-12-05      25      20              45              45
2014-12-06      29      10              39              39
2014-12-07      30      10              40              40
2014-12-08      31      20              51              51
2014-12-09      30      20              50              50
2014-12-10      29      10              39              39
2014-12-11      31      10              41              41

But while applying shift directly works, the same expression inside a function passed to apply does not:

frame['verticaladd1']=frame['level1']+frame['level1'].shift(1)#works

def verticaladd(x):
    test=x['level1']+x['level1'].shift(1)
    return test
frame.apply(verticaladd)#error

results in

KeyError: ('level1', u'occurred at index level1')

I also tried applying to a single column which makes more sense in my mind, but no luck:

def verticaladd2(x):
    test=x-x.shift(1)
    return test
frame['level1'].map(verticaladd2)#error, also with apply

error:

AttributeError: 'numpy.int64' object has no attribute 'shift'
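
Both errors come down to what pandas hands to the function. The snippet below is a minimal sketch, not part of the original session: it uses a shortened three-row copy of the frame and a hypothetical helper named record that only notes the type of its argument. With the default axis=0, apply passes one whole column at a time as a Series indexed by date, so x['level1'] looks for a date label called 'level1' and fails. map passes one element at a time, and a scalar has no shift method.

```python
import pandas as pd

# Shortened copy of the frame from the question (assumption: three rows are enough).
frame = pd.DataFrame(
    {'level1': [20, 19, 20], 'level2': [10, 10, 20]},
    index=pd.date_range('12/1/2014', periods=3),
)

seen = []

def record(x):
    # Hypothetical helper: note the type of whatever pandas passes in,
    # then return the argument unchanged.
    seen.append(type(x).__name__)
    return x

frame.apply(record)            # default axis=0: one call per column
column_types = set(seen)

seen.clear()
frame['level1'].map(record)    # one call per element
element_types = set(seen)

print(column_types)    # the columns arrive as Series objects
print(element_types)   # the elements arrive as scalars, never as Series
```

The exact scalar type reported for map depends on the pandas version (the traceback above shows numpy.int64), which is why the sketch prints it rather than assuming it.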

Why not call shift directly? Because I need to embed it in a function that calculates multiple columns at the same time, along axis 1. See the related question "Ambiguous truth value with boolean logic".


1 Reply


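Try passing the frame to the function directly, rather than using apply. apply calls the function once per column (or once per row with axis=1), so x is a single Series rather than the frame, and x['level1'] then looks for the label 'level1' in that Series's date index, which raises the KeyError. map goes one step further and calls the function on each scalar value, which has no shift method. Called on the whole frame, the function works: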

def f(x):
    # x is the whole DataFrame, so x.level1 is a full column and shift works
    return x.level1 + x.level1.shift(1)

f(frame)

returns:

2014-12-01   NaN
2014-12-02    39
2014-12-03    39
2014-12-04    41
2014-12-05    46
2014-12-06    54
2014-12-07    59
2014-12-08    61
2014-12-09    61
2014-12-10    59
2014-12-11    60
Freq: D, Name: level1, dtype: float64
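
The same idea extends to several columns at once. The following is only a sketch under assumptions: verticaladd is rewritten here to take the whole frame, and combined is a hypothetical calculation (the previous day's level1 plus the current level2) standing in for whatever multi-column logic is actually needed. It also shows that column-wise apply does work when the function treats its argument as a single column instead of indexing it by column name.

```python
import pandas as pd

data = {'level1': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31],
        'level2': [10, 10, 20, 20, 20, 10, 10, 20, 20, 10, 10]}
frame = pd.DataFrame(data, index=pd.date_range('12/1/2014', periods=11))

def verticaladd(df):
    # df is the whole frame, so shift(1) moves every column down one row.
    return df + df.shift(1)

def combined(df):
    # Hypothetical example: previous day's level1 plus the current level2.
    return df['level1'].shift(1) + df['level2']

# Whole-frame call: each column is added to its own shifted copy.
both = verticaladd(frame)

# Column-wise apply gives the same result, because here the function
# receives each column as a Series and never indexes it by column name.
via_apply = frame.apply(lambda col: col + col.shift(1))

# Mixing columns needs the whole frame, so call the function directly.
frame['combined'] = combined(frame)
```

The first row of every result is NaN because there is no previous row to shift in, which is also why the results come back as float64.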
