Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
429 views
in Technique[技术] by (71.8m points)

python - How to convert a column string of hh:mm to a time format

enter image description here

I am trying to read the column total_time and sum up the times where the user is the same meaning 'name 1' = 03:30 + 04:00 and so on, I am trying to use pandas and I have manage to get the data separated, but when I try to take the total_time, as this is an object, I can not just sum it up, so I have tried to change the format of this object to a datetime object, but I haven't been able to do it.

format_time="%H:%M"

user1 = df.loc[df['user']== "name 1"]

user1['hora']=datetime.strptime((user1['total_time']),format_time)

I have tried this and it gives me an error

AttributeError: module 'datetime' has no attribute 'strptime'
question from:https://stackoverflow.com/questions/65853678/how-to-convert-a-column-string-of-hhmm-to-a-time-format

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
  • Aside from the fact that the .strptime is being accessed from the incorrect namespace, it should be datetime.datetime.strptime, the implementation will result in an error (TypeError: strptime() argument 1 must be str, not Series), even with the correct function call.
    • pandas already has built-in vectorized datetime features.
  • In this case, the hours and minutes in the 'total_time column, represent an absolute amount of time, so the column should be converted with pandas.to_timedelta.
    • This function takes a string in the form 'hh:mm:ss', so df.tt + ':00' is used to get the 'total_time' ('tt') column into the correct format.
import pandas as pd

# test dataframe
df = pd.DataFrame({'user': ['n1', 'n2', 'n3', '', 'n1', 'n2'],
                   'ts': ['09:30', '09:30', '09:30', '', '14:00', '10:00'],
                   'te': ['14:00', '10:00', '12:00', '', '18:00', '18:00'],
                   'ls': ['12:00', '', '', '', '', '13:00'],
                   'le': ['13:00', '', '', '', '', '14:00'],
                   'tt': ['03:00', '00:30', '02:30', '00:00', '04:00', '07:00']})

# display(df)
  user     ts     te     ls     le     tt
0   n1  09:30  14:00  12:00  13:00  03:00
1   n2  09:30  10:00                00:30
2   n3  09:30  12:00                02:30
3                                   00:00
4   n1  14:00  18:00                04:00
5   n2  10:00  18:00  13:00  14:00  07:00
# fill blanks with nan
df = df.replace('', np.nan)

# drop all rows that are all nan, except total_time
df = df.dropna(axis=0, how='all', subset=df.columns[:-2])

# display(df)
  user     ts     te     ls     le     tt
0   n1  09:30  14:00  12:00  13:00  03:00
1   n2  09:30  10:00    NaN    NaN  00:30
2   n3  09:30  12:00    NaN    NaN  02:30
4   n1  14:00  18:00    NaN    NaN  04:00
5   n2  10:00  18:00  13:00  14:00  07:00
# convert total_time (tt) to a timedelta
df.tt = pd.to_timedelta(df.tt + ':00')

# groupby user and sum the tt column
total_hours = df.groupby('user').tt.sum().reset_index(name='total_hours')

# display(total_hours)
  user     total_hours
0   n1 0 days 07:00:00
1   n2 0 days 07:30:00
2   n3 0 days 02:30:00

# to have total_hours as a float
total_hours.total_hours = total_hours.total_hours / pd.to_timedelta(1, 'h')

# display(total_hours)
  user  total_hours
0   n1          7.0
1   n2          7.5
2   n3          2.5

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...