Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
111 views
in Technique[技术] by (71.8m points)

python - How to find since how many days the value of a column is True?

Consider the following DataFrame returned by the following dictionary(code given below):

I want to create a new column in this DataFrame which tells us since how many days the values continuously True. (for each ticker i.e. groupby(ticker)).

For e.g. the values that should be there in the new column are written as comments in the below code (for the first few rows). Please comment if you've any doubts regarding the output that's needed:

{'DaysWithGain': {(Timestamp('2019-10-01 04:00:00+0000', tz='UTC'),
      'AAPL'): True,
     (Timestamp('2019-10-01 10:00:00+0000', tz='UTC'), 'AAPL'): True, #1
     (Timestamp('2019-10-01 10:00:00+0000', tz='UTC'), 'FSLY'): False, #0
     (Timestamp('2019-10-01 10:00:00+0000', tz='UTC'), 'LVGO'): False, #0
     (Timestamp('2019-10-01 10:00:00+0000', tz='UTC'), 'SHOP'): True, #1
     (Timestamp('2019-10-01 10:00:00+0000', tz='UTC'), 'UPLD'): False, #0
     (Timestamp('2019-10-01 10:00:00+0000', tz='UTC'), 'ZM'): True, #1
     (Timestamp('2019-10-01 16:00:00+0000', tz='UTC'), 'AAPL'): True, #1
     (Timestamp('2019-10-01 16:00:00+0000', tz='UTC'), 'FSLY'): False,#0
     (Timestamp('2019-10-01 16:00:00+0000', tz='UTC'), 'LVGO'): False, #0
     (Timestamp('2019-10-01 16:00:00+0000', tz='UTC'), 'SHOP'): True, #1
     (Timestamp('2019-10-01 16:00:00+0000', tz='UTC'), 'UPLD'): False, #0
     (Timestamp('2019-10-01 16:00:00+0000', tz='UTC'), 'ZM'): True, #1
     (Timestamp('2019-10-01 22:00:00+0000', tz='UTC'), 'AAPL'): True, #1
     (Timestamp('2019-10-01 22:00:00+0000', tz='UTC'), 'FSLY'): False, #0
     (Timestamp('2019-10-01 22:00:00+0000', tz='UTC'), 'LVGO'): False, #0
     (Timestamp('2019-10-01 22:00:00+0000', tz='UTC'), 'SHOP'): True, # 1
     (Timestamp('2019-10-01 22:00:00+0000', tz='UTC'), 'UPLD'): False, #0
     (Timestamp('2019-10-01 22:00:00+0000', tz='UTC'), 'ZM'): True,#1
     (Timestamp('2019-10-02 04:00:00+0000', tz='UTC'), 'AAPL'): True,#2
     (Timestamp('2019-10-02 04:00:00+0000', tz='UTC'), 'FSLY'): False,#0
     (Timestamp('2019-10-02 04:00:00+0000', tz='UTC'), 'LVGO'): False,#0
     (Timestamp('2019-10-02 04:00:00+0000', tz='UTC'), 'SHOP'): True,#2
     (Timestamp('2019-10-02 04:00:00+0000', tz='UTC'), 'UPLD'): False, #0
     (Timestamp('2019-10-02 04:00:00+0000', tz='UTC'), 'ZM'): True, #2
     (Timestamp('2019-10-02 10:00:00+0000', tz='UTC'), 'AAPL'): True, #2
     (Timestamp('2019-10-02 10:00:00+0000', tz='UTC'), 'FSLY'): False, #0
     (Timestamp('2019-10-02 10:00:00+0000', tz='UTC'), 'LVGO'): False,#0
     (Timestamp('2019-10-02 10:00:00+0000', tz='UTC'), 'SHOP'): True, #0
     (Timestamp('2019-10-02 10:00:00+0000', tz='UTC'), 'UPLD'): False, #0
     (Timestamp('2019-10-02 10:00:00+0000', tz='UTC'), 'ZM'): True,#2
     (Timestamp('2019-10-02 16:00:00+0000', tz='UTC'), 'AAPL'): True,#2
     (Timestamp('2019-10-02 16:00:00+0000', tz='UTC'), 'FSLY'): False,#0
     (Timestamp('2019-10-02 16:00:00+0000', tz='UTC'), 'LVGO'): False,#0
     (Timestamp('2019-10-02 16:00:00+0000', tz='UTC'), 'SHOP'): True,#2
     (Timestamp('2019-10-02 16:00:00+0000', tz='UTC'), 'UPLD'): False,#0
     (Timestamp('2019-10-02 16:00:00+0000', tz='UTC'), 'ZM'): True, # 2
     (Timestamp('2019-10-02 22:00:00+0000', tz='UTC'), 'AAPL'): True, #2
     (Timestamp('2019-10-02 22:00:00+0000', tz='UTC'), 'FSLY'): False,#0
     (Timestamp('2019-10-02 22:00:00+0000', tz='UTC'), 'LVGO'): False, #0
     (Timestamp('2019-10-02 22:00:00+0000', tz='UTC'), 'SHOP'): True, #2
     (Timestamp('2019-10-02 22:00:00+0000', tz='UTC'), 'UPLD'): False,#0
     (Timestamp('2019-10-02 22:00:00+0000', tz='UTC'), 'ZM'): True,#2
     (Timestamp('2019-10-03 04:00:00+0000', tz='UTC'), 'AAPL'): False,
     (Timestamp('2019-10-03 04:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-03 04:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-03 04:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-03 04:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-03 04:00:00+0000', tz='UTC'), 'ZM'): True,
     (Timestamp('2019-10-03 10:00:00+0000', tz='UTC'), 'AAPL'): False,
     (Timestamp('2019-10-03 10:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-03 10:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-03 10:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-03 10:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-03 10:00:00+0000', tz='UTC'), 'ZM'): True,
     (Timestamp('2019-10-03 16:00:00+0000', tz='UTC'), 'AAPL'): False,
     (Timestamp('2019-10-03 16:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-03 16:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-03 16:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-03 16:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-03 16:00:00+0000', tz='UTC'), 'ZM'): True,
     (Timestamp('2019-10-03 22:00:00+0000', tz='UTC'), 'AAPL'): False,
     (Timestamp('2019-10-03 22:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-03 22:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-03 22:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-03 22:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-03 22:00:00+0000', tz='UTC'), 'ZM'): True,
     (Timestamp('2019-10-04 04:00:00+0000', tz='UTC'), 'AAPL'): True,
     (Timestamp('2019-10-04 04:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-04 04:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-04 04:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-04 04:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-04 04:00:00+0000', tz='UTC'), 'ZM'): True,
     (Timestamp('2019-10-04 10:00:00+0000', tz='UTC'), 'AAPL'): True,
     (Timestamp('2019-10-04 10:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-04 10:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-04 10:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-04 10:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-04 10:00:00+0000', tz='UTC'), 'ZM'): True,
     (Timestamp('2019-10-04 16:00:00+0000', tz='UTC'), 'AAPL'): True,
     (Timestamp('2019-10-04 16:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-04 16:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-04 16:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-04 16:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-04 16:00:00+0000', tz='UTC'), 'ZM'): False,
     (Timestamp('2019-10-04 22:00:00+0000', tz='UTC'), 'AAPL'): True,
     (Timestamp('2019-10-04 22:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-04 22:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-04 22:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-04 22:00:00+0000', tz='UTC'), 'UPLD'): False,
     (Timestamp('2019-10-04 22:00:00+0000', tz='UTC'), 'ZM'): False,
     (Timestamp('2019-10-07 04:00:00+0000', tz='UTC'), 'AAPL'): True,
     (Timestamp('2019-10-07 04:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-07 04:00:00+0000', tz='UTC'), 'LVGO'): False,
     (Timestamp('2019-10-07 04:00:00+0000', tz='UTC'), 'SHOP'): True,
     (Timestamp('2019-10-07 04:00:00+0000', tz='UTC'), 'UPLD'): True,
     (Timestamp('2019-10-07 04:00:00+0000', tz='UTC'), 'ZM'): False,
     (Timestamp('2019-10-07 10:00:00+0000', tz='UTC'), 'AAPL'): True,
     (Timestamp('2019-10-07 10:00:00+0000', tz='UTC'), 'FSLY'): False,
     (Timestamp('2019-10-07 10:00:00+0000', tz='UTC'), 'LVGO'): False}}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This requires a custom aggregation function. The following function takes a list-like series of True/False values and returns the number of True's at the end:

from functools import reduce
def num_last_true_vals(series):
    return reduce(lambda c,v: (c+1)*v, series, 0)

let's check. The following call

(
num_last_true_vals([True,True,True,True]), 
num_last_true_vals([True,False,True,True]),
num_last_true_vals([True,False,False,True]),
num_last_true_vals([True,False,True,False]),
)

returns

(4, 2, 1, 0)

as expected

Now on to your dataset. It is not very interesting in the sense that for each ticker all the DaysWithGain values are either all True or all False. So I modify it a bit to make sure the solution works as promised. With dd being the dictionary you provided, we do

import pandas as pd
from pandas import Timestamp
df = pd.DataFrame(dd)
df.loc[(Timestamp('2019-10-03 10:00:00+00:00'),'AAPL'), 'DaysWithGain'] = False
df = df.sort_index(level=0)

note that we set one of the entries for 'AAPL' to False. Also note we sort by timestamp for good measure

Now on to the main act, using our custom num_last_true_vals function:

df.groupby(level=1).agg({'DaysWithGain':num_last_true_vals})

produces


        DaysWithGain
AAPL    8
FSLY    0
LVGO    0
SHOP    16
UPLD    0
ZM      16

where the column DaysWithGain returns the number of timestamps at the end that were True

Edit after comments by OP

You can use the same function on the rolling basis

so the following should work. Note the last couple of bits are for pretty-printing, you may decide to skip them

df2 = (df.groupby(level=1)
        .apply(lambda d: d.assign(ts_since_true = d['DaysWithGain']
            .rolling(window=1000, min_periods=1)
            .apply(num_last_true_vals)
        ))
        .reset_index()
        .sort_values(['level_1','level_0'])
    )

this produces (with your recently-edited dictionary)

    level_0                    level_1    DaysWithGain      ts_since_true
--  -------------------------  ---------  --------------  ---------------
 0  2019-10-01 04:00:00+00:00  AAPL       True                          1
 1  2019-10-01 10:00:00+00:00  AAPL       True                          2
 7  2019-10-01 16:00:00+00:00  AAPL       True                          3
13  2019-10-01 22:00:00+00:00  AAPL       True                          4
19  2019-10-02 04:00:00+00:00  AAPL       True                          5
25  2019-10-02 10:00:00+00:00  AAPL       True                          6
31  2019-10-02 16:00:00+00:00  AAPL       True                          7
37  2019-10-02 22:00:00+00:00  AAPL       True                          8
43  2019-10-03 04:00:00+00:00  AAPL       False                         0
49  2019-10-03 10:00:00+00:00  AAPL       False                         0
55  2019-10-03 16:00:00+00:00  AAPL       False                         0
61  2019-10-03 22:00:00+00:00  AAPL       False                         0
67  2019-10-04 04:00:00+00:00  AAPL       True                          1
73  2019-10-04 10:00:00+00:00  AAPL       True                          2
79  2019-10-04 16:00:00+00:00  AAPL       True                          3
85  2019-10-04 22:00:00+00:00  AAPL       True                          4
91  2019-10-07 04:00:00+00:00  AAPL       True                          5
97  2019-10-07 10:00:00+00:00  AAPL       True                          6
 2  2019-10-01 10:00:00+00:00  FSLY       False                         0
 8  2019-10-01 16:00:00+00:00  FSLY       False                         0
14  2019-10-01 22:00:00+00:00  FSLY       False                         0
20  2019-10-02 04:00:00+00:00  FSLY       False                         0
26  2019-10-02 10:00:00+00:00  FSLY       False                         0
32  2019-10-02 16:00:00+00:00  FSLY       False                         0
38  2019-10-02 22:00:00+00:00  FSLY       False                         0
44  2019-10-03 04:00:00+00:00  FSLY       False                         0
50  2019-10-03 10:00:00+00:00  FSLY       False                         0
56  2019-10-03 16:00:00+00:00  FSLY       False                         0
62  2019-10-03 22:00:00+00:00  FSLY       False                         0
68  2019-10-04 04:00:00+00:00  FSLY       False                         0
74  2019-10-04 10:00:00+00:00  FSLY       False                         0
80  2019-10-04 16:00:00+00:00  FSLY       False                         0
86  2019-10-04 22:00:00+00:00  FSLY       False                         0
92  2019-10-07 04:00:00+00:00  FSLY       False                         0
98  2019-10-07 10:00:00+00:00  FSLY       False                         0
 3  2019-10-01 10:00:00+00:00  LVGO       False                         0
 9  2019-10-01 16:00:00+00:00  LVGO       False                         0
15  2019-10-01 22:00:00+00:00  LVGO       False                         0
21  2019-10-02 04:00:00+00:00  LVGO       False                         0
27  2019-10-02 10:00:00+00:00  LVGO       False                         0
33  2019-10-02 16:00:00+00:00  LVGO       False                         0
39  2019-10-02 22:00:00+00:00  LVGO       False                         0
45  2019-10-03 04:00:00+00:00  LVGO       False                         0
51  2019-10-03 10:00:00+00:00  LVGO       False                         0
57  2019-10-03 16:00:00+00:00  LVGO       False                         0
63  2019-10-03 22:00:00+00:00  LVGO       False                         0
69  2019-10-04 04:00:00+00:00  LVGO       False                         0
75  2019-10-04 10:00:00+00:00  LVGO       False                         0
81  2019-10-04 16:00:00+00:00  LVGO       False                         0
87  2019-10-04 22:00:00+00:00  LVGO       False                         0
93  2019-10-07 04:00:00+00:00  LVGO       False                         0
99  2019-10-07 10:00:00+00:00  LVGO       False                         0
 4  2019-10-01 10:00:00+00:00  SHOP       True                          1
10  2019-10-01 16:00:00+00:00  SHOP       True                          2
16  2019-10-01 22:00:00+00:00  SHOP       True                          3
22  2019-10-02 04:00:00+00:00  SHOP       True                          4
28  2019-10-02 10:00:00+00:00  SHOP       True                          5
34  2019-10-02 16:00:00+00:00  SHOP       True                          6
40  2019-10-02 22:00:00+00:00  SHOP       True                          7
46  2019-10-03 04:00:00+00:00  SHOP       True                          8
52  2019-10-03 10:00:00+00:00  SHOP       True                          9
58  2019-10-03 16:00:00+00:00  SHOP       True                         10
64  2019-10-03 22:00:00+00:00  SHOP       True                         11
70  2019-10-04 04:00:00+00:00  SHOP       True                         12
76  2019-10-04 10:00:00+00:00  SHOP       True                         13
82  2019-10-04 16:00:00+00:00  SHOP       True                         14
88  2019-10-04 22:00:00+00:00  SHOP       True                         15
94  2019-10-07 04:00:00+00:00  SHOP       True                         16
 5  2019-10-01 10:00:00+00:00  UPLD       False                         0
11  2019-10-01 16:00:00+00:00  UPLD       False                         0
17  2019-10-01 22:00:00+00:00  UPLD       False                         0
23  2019-10-02 04:00:00+00:00  UPLD       False                         0
29  2019-10-02 10:00:00+00:00  UPLD       False                         0
35  2019-10-02 16:00:00+00:00  UPLD       False                         0
41  2019-10-02 22:00:00+00:00  UPLD       False                         0
47  2019-10-03 04:00:00+00:00  UPLD       False                         0
53  2019-10-03 10:00:00+00:00  UPLD       False                         0
59  2019-10-03 16:00:00+00:00  UPLD       False                         0
65  2019-10-03 22:00:00+00:00  UPLD       False                         0
71  2019-10-04 04:00:00+00:00  UPLD       False                         0
77  2019-10-04 10:00:00+00:00  UPLD       False                         0
83  2019-10-04 16:00:00+00:00  UPLD       False                         0
89  2019-10-04 22:00:00+00:00  UPLD       False                         0
95  2019-10-07 04:00:00+00:00  UPLD       True                          1
 6  2019-10-01 10:00:00+00:00  ZM         True                          1
12  2019-10-01 16:00:00+00:00  ZM         True                          2
18  2019-10-01 22:00:00+00:00  ZM         True                          3
24  2019-10-02 04:00:00+00:00  ZM         True                          4
30  2019-10-02 10:00:00+00:00  ZM         True                          5
36  2019-10-02 16:00:00+00:00  ZM         True                          6
42  2019-10-02 22:00:00+00:00  ZM         True                          7
48  2019-10-03 04:00:00+00:00  ZM         True                          8
54  2019-10-03 10:00:00+00:00  ZM         True                          9
60  2019-10-03 16:00:00+00:00  ZM         True                         10
66  2019-10-03 22:00:00+00:00  ZM         True                         11
72  2019-10-04 04:00:00+00:00  ZM         True                         12
78  2019-10-04 10:00:00+00:00  ZM         True                         13
84  2019-10-04 16:00:00+00:00  ZM         False                         0
90  2019-10-04 22:00:00+00:00  ZM         False                         0
96  2019-10-07 04:00:00+00:00  ZM         False                         0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...