Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Date Time Series wise grouping of data and distribution

I am trying to merge a datetime series with repository (master) data, grouping by Name and summing the values.

File1.csv 

Timeseries,Name,count
07/03/2015 06:00:00,Paris,100
07/03/2015 06:00:00,Paris,600
07/03/2015 06:00:00,Paris,700
07/03/2015 06:00:00,London,200
07/03/2015 06:00:00,London,100
07/03/2015 06:00:00,London,500
07/03/2015 06:00:00,Dublin,300
07/03/2015 06:00:00,Dublin,400
07/03/2015 06:00:00,Dublin,400

Output

Master_file.csv (append mode)

    Name,Timeseries(n-1),Timeseries(n)    #put the datetime series as header and put the summed values under it
    Paris,300,1400      #Sum of all the values with same Name
    London,200,800
    Dublin,400,1100

Program 

import pandas as pd 
import numpy as np

df = pd.read_csv('/home/lat_lon1.csv')
df1 = pd.read_csv('/home/lat_lon_master.csv')


gp = df.groupby('Name')['date timeseries'].sum().reset_index() 
df1.merge(gp, on='Name')

I am having trouble turning the datetime column into a header and putting the correct values under it. Names that are not found can be given NaN and filled in on later iterations.

1 Reply

Please check the pandas DataFrame documentation. Here is the code you are looking for.

Output

Timeseries           Name    count
07/03/2015 06:00:00  Dublin  1100
07/03/2015 06:00:00  London  800
07/03/2015 06:00:00  Paris   1400

    #!/usr/bin/env python
    import pandas as pd

    # Please enter the file location
    df = pd.read_csv('/home/saiharsh/Documents/Crowd Street/Transition_Data/Telecom_7.csv')

    # For each Name, keep the last timestamp seen and sum the counts.
    # Only 'count' is summed, so the Timeseries strings are never added together.
    # groupby sorts its keys, so the rows come out as Dublin, London, Paris.
    gp = df.groupby('Name').agg(Timeseries=('Timeseries', 'last'),
                                count=('count', 'sum')).reset_index()

    # Put the columns in the same order as the input file.
    result = gp[['Timeseries', 'Name', 'count']]

    # to_csv() with no path returns the CSV text as a string.
    print(result.to_csv(index=False))
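
If you also need the layout from the question (one row per Name, one column per timestamp, merged into the master file), a pivot followed by an outer merge is one way to get there. This is only a sketch under assumptions: the master file is assumed to hold a Name column plus one column per earlier timestamp, the header `07/03/2015 05:00:00` is a made-up placeholder for Timeseries(n-1), and the data is read from in-memory strings instead of your real paths.

```python
import io
import pandas as pd

# Rows copied from File1.csv in the question.
new_csv = io.StringIO("""Timeseries,Name,count
07/03/2015 06:00:00,Paris,100
07/03/2015 06:00:00,Paris,600
07/03/2015 06:00:00,Paris,700
07/03/2015 06:00:00,London,200
07/03/2015 06:00:00,London,100
07/03/2015 06:00:00,London,500
07/03/2015 06:00:00,Dublin,300
07/03/2015 06:00:00,Dublin,400
07/03/2015 06:00:00,Dublin,400
""")

# Assumed master layout. The 05:00:00 header is hypothetical; the values
# are the Timeseries(n-1) numbers from the question's expected output.
master_csv = io.StringIO("""Name,07/03/2015 05:00:00
Paris,300
London,200
Dublin,400
""")

df = pd.read_csv(new_csv)
master = pd.read_csv(master_csv)

# Sum 'count' per (Name, Timeseries) and turn each timestamp into a column.
wide = df.pivot_table(index='Name', columns='Timeseries',
                      values='count', aggfunc='sum').reset_index()
wide.columns.name = None  # drop the leftover 'Timeseries' axis label

# The outer merge keeps Names that appear on only one side;
# their missing cells are filled with NaN.
merged = master.merge(wide, on='Name', how='outer')

print(merged.to_csv(index=False))
```

Writing `merged` back with `merged.to_csv(path, index=False)` then replaces the master file with the widened table. That is usually simpler than true append mode, because appending can only add rows, not columns.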

Please feel free to reply if you run into any problems.

