Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Accessing dataframe with multiindex with date intermittently failing

I have a dataframe with a multiindex from which I am attempting to access a row. The dataframe can be recreated with the following:

from datetime import timedelta, date
import pandas as pd
import pytz
from pandas import Timestamp

utc = pytz.UTC

data = {
    "date": [
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).replace(minute=59, second=59, microsecond=999999),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date(),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date(),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=1),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=1),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=2),
        Timestamp("2020-06-03 15:00:00").replace(tzinfo=utc).date() + timedelta(days=2),
    ],
    "status": ["in_progress", "in_progress", "done", "in_progress", "done", "in_progress", "done"],
    "issue_count": [20, 18, 2, 14, 6, 10, 10],
    "points": [100, 90, 10, 70, 30, 50, 50],
    "stories": [0, 0, 0, 0, 0, 0, 0],
    "tasks": [100, 100, 100, 100, 100, 100, 100],
    "bugs": [0, 0, 0, 0, 0, 0, 0],
    "subtasks": [0, 0, 0, 0, 0, 0, 0],
    "assignee": ["Name", "Name", "Name", "Name", "Name", "Name", "Name"],
}
df = pd.DataFrame(data)

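# Aggregate each (date, status) group; the result is indexed by a two-level MultiIndex.
breakdown = df.groupby(["date", "status"]).sum()
# Look up the row for 3 June 2020 with status "done".
d = date(2020, 6, 3)
done = breakdown.loc[d, "done"]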

The line:

done = breakdown.loc[d, "done"]

fails intermittently, on roughly one run in ten, with the following stack trace:

...
  File "/Users/<name>/miniconda3/envs/<repo>/lib/python3.9/site-packages/pandas/core/indexes/multi.py", line 2979, in _get_level_indexer
    i = level_codes.searchsorted(code, side="left")
TypeError: '<' not supported between instances of 'int' and 'slice'

I tried debugging the code and saw that on the failing line, level_codes.searchsorted(code, side="left"), code was a slice instead of an int, so the search was failing. I do not know why this is occurring, or why it happens only intermittently.
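One detail of the reproduction that may be worth checking: the first entry in "date" is a tz-aware Timestamp, while the remaining entries are plain datetime.date objects. Below is a minimal, stdlib-only sketch of how such a mix could be detected and normalized. It uses datetime.datetime as a stand-in for pandas.Timestamp (which subclasses it), and the helper name to_plain_date is hypothetical. Whether this mix is what causes the intermittent failure is not established by the question.

```python
from datetime import date, datetime, timedelta, timezone

# Stand-in for the question's first entry: a tz-aware value at 15:59:59.999999 UTC.
# datetime.datetime is used instead of pandas.Timestamp (an assumption made for
# illustration; Timestamp is a datetime subclass, so the same check applies).
first = datetime(2020, 6, 3, 15, 59, 59, 999999, tzinfo=timezone.utc)
base = date(2020, 6, 3)
raw_dates = [first] + [base + timedelta(days=n) for n in (0, 0, 1, 1, 2, 2)]


def to_plain_date(value):
    """Hypothetical helper: reduce datetime-like values to a plain datetime.date."""
    # datetime must be tested explicitly because every datetime is also a date.
    if isinstance(value, datetime):
        return value.date()
    return value


# Names of the distinct Python types present before and after normalizing.
types_before = sorted({type(v).__name__ for v in raw_dates})
clean_dates = [to_plain_date(v) for v in raw_dates]
types_after = sorted({type(v).__name__ for v in clean_dates})

print(types_before)  # ['date', 'datetime']
print(types_after)   # ['date']
```

If the normalized list were used as the "date" column, every key in that index level would have the same type as the lookup value d. Note that this also changes the grouping: the first row would then fall on the same (date, status) key as the second row.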

Question from: https://stackoverflow.com/questions/65927344/accessing-dataframe-with-multiindex-with-date-intermittently-failing

1 Reply

Waiting for answers


...