Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
104 views
in Technique[技术] by (71.8m points)

python - Calculate which Tickets would be open if I have a max open at the same time

I have many Dataframes of Tickets with open and close datetime64 columns. This is an example:

df = pd.DataFrame({
    "open": ["03/27/14", "03/28/14", "03/31/14", "04/01/14", "04/02/14", "04/03/14", "04/04/14", "04/07/14", "04/08/14", "04/09/14", "05/10/14", "05/11/14", "05/14/14", "05/15/14", "05/16/14", "05/17/14", "05/21/14", "05/22/14", "05/25/14", "05/26/14"],
    "close": ["04/01/14", "04/02/14", "04/03/14", "04/04/14", "04/07/14", "04/08/14", "04/09/14", "04/10/14", "04/11/14", "04/14/14", "05/15/14", "05/16/14", "05/17/14", "05/24/14", "05/24/14", "05/27/14", "05/27/14", "05/27/14", "05/28/14", "05/29/14"]
    }).astype({
    "open": "datetime64[ns]",
    "close": "datetime64[ns]",
})

I need to know which tickets would be open if I can have maximum n open tickets at the same time.

Note: tickets will be open only in the afternoon and closed only in the morning, so if 1 ticket have same close date as another ticket open, the 2 tickets are consecutives, not overlapped.

For example, with n=1 or n=2 or n=3 should give this result:

    en_date     ex_date     n=1     n=2     n=3
0   03/27/14    04/01/14    True    True    True
1   03/28/14    04/02/14    False   True    True
2   03/31/14    04/03/14    False   False   True
3   04/01/14    04/04/14    True    True    True
4   04/02/14    04/07/14    False   True    True
5   04/03/14    04/08/14    False   False   True
6   04/04/14    04/09/14    True    True    True
7   04/07/14    04/10/14    False   True    True
8   04/08/14    04/11/14    False   False   True
9   04/09/14    04/14/14    True    True    True
10  05/10/14    05/15/14    True    True    True
11  05/11/14    05/16/14    False   True    True
12  05/14/14    05/17/14    False   False   True
13  05/15/14    05/24/14    True    True    True
14  05/16/14    05/24/14    False   True    True
15  05/17/14    05/27/14    False   False   True
16  05/21/14    05/27/14    False   False   False
17  05/22/14    05/27/14    False   False   False
18  05/25/14    05/28/14    True    True    True
19  05/26/14    05/29/14    False   True    True

I could do a for loop but the dfs have millions of rows and it would take ages.
Which is the fastest way to calculate it?

question from:https://stackoverflow.com/questions/65853774/calculate-which-tickets-would-be-open-if-i-have-a-max-open-at-the-same-time

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
n = 2

# Number of previous tickets overlapping
df['over'] = np.sum(np.tril(np.subtract.outer(df['open'].astype(str).str.replace('-', '').astype(int).values,df['close'].astype(str).str.replace('-', '')

def reduce_it(acc, row):
    i = acc[1]
    prev_open = sum([acc[0][i-p] for p in range(1,row+1)])
    
    acc[0][i] = 1 if prev_open < n else 0
    return (acc[0], i+1)

df["n=2"], _ = reduce(reduce_it, df["over"].values, (np.empty(df.shape[0], dtype=np.bool), 0))

NOTE: At the moment this is the fastest way I've found. Happy to change accepted answer if someone find a faster code.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...