I am having issues with joins in pandas and I am trying to figure out what is wrong.
Say I have a dataframe
x:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close 1941 non-null values
high 1941 non-null values
low 1941 non-null values
open 1941 non-null values
dtypes: float64(4)
Should I be able to join it with y on the index using a simple join command, where y is the same as x except that each column name has a 2 appended?
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close2 1941 non-null values
high2 1941 non-null values
low2 1941 non-null values
open2 1941 non-null values
dtypes: float64(4)
Both y.join(x) and pandas.DataFrame.join(y, x) give:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 34879 entries, 2004-12-16 00:00:00 to 2012-07-12 00:00:00
Data columns:
close2 34879 non-null values
high2 34879 non-null values
low2 34879 non-null values
open2 34879 non-null values
close 34879 non-null values
high 34879 non-null values
low 34879 non-null values
open 34879 non-null values
dtypes: float64(8)
I expected the result to have 1941 non-null values in every column. I tried merge as well, but I get the same result.
I had thought the right answer was pandas.concat([x,y]), but this does not do what I intend either.
In [83]: pandas.concat([x,y])
Out[83]: <class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 3882 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close2 3882 non-null values
high2 3882 non-null values
low2 3882 non-null values
open2 3882 non-null values
dtypes: float64(4)
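For reference, the row count doubles here because pandas.concat stacks frames along the rows (axis=0) by default. Passing axis=1 places the frames side by side, aligned on the index. A minimal sketch on made-up toy data (the frames and values below are illustrative assumptions, not the real price data):

```python
import pandas as pd

# Hypothetical toy frames with a unique DatetimeIndex (not the real data).
idx = pd.date_range("2004-10-19", periods=3, freq="D")
x = pd.DataFrame({"close": [1.0, 2.0, 3.0], "open": [0.9, 1.9, 2.9]}, index=idx)
y = x.rename(columns=lambda name: name + "2")

# Default axis=0 appends the rows of y below the rows of x.
stacked = pd.concat([x, y])

# axis=1 aligns on the index and puts the columns next to each other.
side_by_side = pd.concat([x, y], axis=1)

print(stacked.shape)
print(side_by_side.shape)
```

This only behaves as a clean column-wise join when the index labels are unique, which turned out not to be the case for my data.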
edit:
If you are having issues with join, read Wes's answer below. I had one time stamp that was duplicated.
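To illustrate how a duplicated timestamp inflates a join, here is a small sketch on made-up data (the frames, values, and the choice to keep the first duplicate are assumptions for illustration, not my real data or necessarily the right fix for yours). Each duplicated index label on one side is matched with every copy on the other side, so the row count grows.

```python
import pandas as pd

# Hypothetical toy data: the timestamp 2004-10-20 appears twice in the index.
idx = pd.DatetimeIndex(["2004-10-19", "2004-10-20", "2004-10-20", "2004-10-21"])
x = pd.DataFrame({"close": [1.0, 2.0, 2.5, 3.0]}, index=idx)
y = x.rename(columns=lambda name: name + "2")

# The duplicated label matches 2 x 2 = 4 times, so the join has more rows than x.
joined = y.join(x)
n_joined = len(joined)

# Two ways to check for duplicated index labels.
is_unique = x.index.is_unique
dupes = x.index[x.index.duplicated()]

# One possible cleanup (an assumption about what is wanted): keep the first
# row for each timestamp before joining.
x_clean = x[~x.index.duplicated(keep="first")]
y_clean = y[~y.index.duplicated(keep="first")]
n_clean = len(y_clean.join(x_clean))

print(n_joined, is_unique, list(dupes), n_clean)
```

Checking index.is_unique on both frames before joining is a quick way to spot this.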