Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to get an ordered count of month names from a datetime index

I have a dataframe called WorkOrders that looks like this:

(image of the dataframe not reproduced)

What I want to do is convert the format of the timestamp in the index so that the index reads as 2018-Feb-27 10:47:00.

Then I can call a method to count the number of rows in each month. The following method, using a lambda expression and a groupby, returns:

(image of the output not reproduced)

This is essentially what I am trying to do, except that 1 should become Jan, 2 should become Feb, etc. Using the .month_name() method works, but the result is no longer in chronological order, e.g.

(image of the output not reproduced)

This is what I would like to see, except in the order January, February, March, ...

Question from: https://stackoverflow.com/questions/65837930/how-to-get-an-ordered-count-of-month-names-from-a-datetime-index


1 Reply

  • The desired format, '2018-Feb-27 10:47:00', is a str, not a datetime64[ns] dtype. Once the index holds strings, df.index.month and df[col].dt.month can no longer be used to extract the month, because those datetime attributes are not available on strings. Keep the index as a datetime dtype for the grouping.
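To illustrate the point above, here is a minimal sketch. The two-row frame is a hypothetical stand-in for WorkOrders, and strftime is one possible way to produce the display format (the '%b' abbreviation depends on the active locale). It suggests keeping the datetime index for calculations and building the formatted strings only for display.

```python
import pandas as pd

# Hypothetical two-row frame standing in for WorkOrders
idx = pd.to_datetime(['2018-02-27 10:47:00', '2018-03-01 09:15:00'])
df = pd.DataFrame({'a': [1, 2]}, index=idx)

# With a datetime index, the month number is directly available
months = df.index.month

# A formatted copy for display only: these are plain strings
labels = df.index.strftime('%Y-%b-%d %H:%M:%S')

# The string index no longer exposes datetime attributes such as .month
has_month = hasattr(labels, 'month')
```

Here has_month is False, which is why the grouping should be done before (or independently of) any string formatting.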

Option 1:

  • I think it's easier to use the calendar module (part of the standard library) to get a list of month names, and then map the month number to the month name.
import calendar

import numpy as np
import pandas as pd

# test dataframe: 3000 consecutive calendar days starting 2021-01-21
df = pd.DataFrame({'a': np.random.randint(1, 10, size=3000)}, index=pd.date_range('2021-01-21', freq='D', periods=3000))
df.index.name = 'Call Out Time'

# display(df.head())
               a
Call Out Time   
2021-01-21     3
2021-01-22     8
2021-01-23     7
2021-01-24     2
2021-01-25     9

# groupby month: the index must be a datetime dtype
dfg = df.groupby(df.index.month).size().reset_index(name='counts')

# map the month number to the month name
dfg['Call Out Time'] = dfg['Call Out Time'].map(dict(zip(range(1, 13), calendar.month_name[1:])))
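As a side note on the [1:] slice used above, the short sketch below shows why it is needed. The names `names` and `month_map` are illustrative only, and the month names are in English only under the default locale.

```python
import calendar

# calendar.month_name has 13 entries: index 0 is an empty string,
# so that indexes 1..12 line up with the month numbers
names = list(calendar.month_name)

# Slicing from 1 drops the empty placeholder, leaving the 12 month names
month_map = dict(zip(range(1, 13), calendar.month_name[1:]))
```

Without the slice, zip would pair month 1 with the empty string and shift every name by one.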

Option 2:

# groupby month_name
dfg = df.groupby(df.index.month_name()).size().reset_index(name='counts')

# set as categorical
dfg['Call Out Time'] = pd.Categorical(dfg['Call Out Time'], categories=calendar.month_name[1:], ordered=True)

# sort the values
dfg = dfg.sort_values('Call Out Time').reset_index(drop=True)

Result of both options:

   Call Out Time  counts
0        January     259
1       February     254
2          March     279
3          April     248
4            May     248
5           June     240
6           July     248
7         August     248
8      September     240
9        October     248
10      November     240
11      December     248
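A third variant, not part of the original answer, is sketched below under the same assumptions as the test dataframe: group by month name and then reindex against the calendar order. The seeded generator and the name `counts` are illustrative choices. It returns a Series rather than a dataframe, and the month names match only when pandas and calendar use the same (default English) locale.

```python
import calendar

import numpy as np
import pandas as pd

# Same date range as the test dataframe; the values in 'a' do not affect the counts
rng = np.random.default_rng(0)
idx = pd.date_range('2021-01-21', periods=3000, freq='D')
df = pd.DataFrame({'a': rng.integers(1, 10, size=3000)}, index=idx)
df.index.name = 'Call Out Time'

# January..December in calendar order
month_names = list(calendar.month_name[1:])

# Count rows per month name, then force chronological order;
# fill_value=0 keeps months that have no rows at all
counts = (
    df.groupby(df.index.month_name())
      .size()
      .reindex(month_names, fill_value=0)
      .rename('counts')
)
```

Calling counts.reset_index() afterwards gives a two-column dataframe similar to the result shown above.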
