Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Pandas join/merge/concat two dataframes

I am having issues with joins in pandas and I am trying to figure out what is wrong. Say I have a dataframe x:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close    1941  non-null values
high     1941  non-null values
low      1941  non-null values
open     1941  non-null values
dtypes: float64(4)

Should I be able to join it with y on the index using a simple join command, where y is the same as x except that each column name has a 2 appended?

 <class 'pandas.core.frame.DataFrame'>
 DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
 Data columns:
 close2    1941  non-null values
 high2     1941  non-null values
 low2      1941  non-null values
 open2     1941  non-null values
 dtypes: float64(4)

 Running y.join(x) or pandas.DataFrame.join(y, x) gives:
 <class 'pandas.core.frame.DataFrame'>
 DatetimeIndex: 34879 entries, 2004-12-16 00:00:00 to 2012-07-12 00:00:00
 Data columns:
 close2    34879  non-null values
 high2     34879  non-null values
 low2      34879  non-null values
 open2     34879  non-null values
 close     34879  non-null values
 high      34879  non-null values
 low       34879  non-null values
 open      34879  non-null values
 dtypes: float64(8)

I expect the result to have 1941 non-null values in every column. I tried merge as well, but I get the same result.

I had thought the right answer was pandas.concat([x,y]), but this does not do what I intend either.

In [83]: pandas.concat([x,y])
Out[83]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 3882 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close2    3882  non-null values
high2     3882  non-null values
low2      3882  non-null values
open2     3882  non-null values
dtypes: float64(4)
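
For context, pandas.concat stacks frames vertically by default (axis=0), which is consistent with the 3882 = 2 x 1941 rows above. Passing axis=1 aligns the frames on their index instead. A minimal sketch with made-up data (the small frames and values here are illustrative assumptions, not the asker's data), assuming both indexes are unique:

```python
import pandas as pd

# Hypothetical stand-ins for x and y: same index, y's columns carry a "2" suffix.
idx = pd.date_range("2004-10-19", periods=3, freq="D")
x = pd.DataFrame({"close": [1.0, 2.0, 3.0], "open": [0.5, 1.5, 2.5]}, index=idx)
y = x.add_suffix("2")

# Default axis=0 appends rows and takes the union of the columns.
stacked = pd.concat([x, y])

# axis=1 places the frames side by side, aligned on the index.
side_by_side = pd.concat([x, y], axis=1)

print(stacked.shape)
print(side_by_side.shape)
```

With a unique index, the axis=1 result keeps one row per timestamp, which is the shape the question is after.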

Edit: If you are having issues with join, read Wes's answer below. My index had one timestamp that was duplicated.
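
One way to locate a duplicated timestamp like this is Index.duplicated. The following is a sketch on a small made-up frame (the dates and values are assumptions for illustration); whether keeping the first occurrence is the right fix depends on the data:

```python
import pandas as pd

# Hypothetical frame in which 2004-10-20 appears twice in the index.
idx = pd.to_datetime(["2004-10-19", "2004-10-20", "2004-10-20", "2004-10-21"])
x = pd.DataFrame({"close": [1.0, 2.0, 2.0, 3.0]}, index=idx)

print(x.index.is_unique)  # False when any index label repeats

# keep=False marks every occurrence of a repeated label, not just the later ones.
dupes = x[x.index.duplicated(keep=False)]
print(dupes)

# Keep only the first row for each timestamp (one possible cleanup).
x_unique = x[~x.index.duplicated(keep="first")]
print(len(x_unique))
```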

1 Reply

Does your index contain duplicates? You can check with x.index.is_unique. If it does, that would explain the behavior you're seeing:

In [16]: left
Out[16]: 
            a
2000-01-01  1
2000-01-01  1
2000-01-01  1
2000-01-02  2
2000-01-02  2
2000-01-02  2

In [17]: right
Out[17]: 
            b
2000-01-01  3
2000-01-01  3
2000-01-01  3
2000-01-02  4
2000-01-02  4
2000-01-02  4

In [18]: left.join(right)
Out[18]: 
            a  b
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
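
The example above can be reproduced as a script. Each index label that appears m times on the left and n times on the right produces m x n rows, so 3 x 3 rows for each of the two dates gives 18. As an optional guard, merge accepts a validate argument that raises when the keys are not unique; using it here is a suggestion, not something from the original answer:

```python
import pandas as pd

# Rebuild the left/right frames shown above: each date occurs three times.
idx = pd.to_datetime(["2000-01-01"] * 3 + ["2000-01-02"] * 3)
left = pd.DataFrame({"a": [1, 1, 1, 2, 2, 2]}, index=idx)
right = pd.DataFrame({"b": [3, 3, 3, 4, 4, 4]}, index=idx)

# Duplicate labels on both sides are matched pairwise: 3 * 3 rows per date.
joined = left.join(right)
print(len(joined))

# validate="one_to_one" makes pandas check key uniqueness before merging.
try:
    pd.merge(left, right, left_index=True, right_index=True,
             validate="one_to_one")
except pd.errors.MergeError as exc:
    print("not a one-to-one merge:", exc)
```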
