Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
326 views
in Technique[技术] by (71.8m points)

python - Add missing day rows in stock market data to maintain continuity in pandas dataframe

So I have around 13 years of stock market data of daily low high open close. The problem is the markets are closed sometimes in between and hence Monday to Friday might not appear continuously sometimes. Look below

Date Day Open High Low Close Adjusted Close
0 17-09-2007 Monday 6898 6977.2 6843 6897.1 6897.100098
1 18-09-2007 Tuesday 6921.15 7078.95 6883.6 7059.65 7059.649902
2 19-09-2007 Wednesday 7111 7419.35 7111 7401.85 7401.850098
3 20-09-2007 Thursday 7404.95 7462.9 7343.6 7390.15 7390.149902
4 21-09-2007 Friday 7378.3 7506.35 7367.15 7464.5 7464.5
5 25-09-2007 Tuesday 7658.5 7694.25 7490.2 7629.15 7629.149902
7 26-09-2007 Wednesday 7647.1 7829.85 7591.8 7755.9 7755.899902
8 28-09-2007 Friday 7838.25 8082.85 7836.05 8042.2 8042.200195
9 01-10-2007 Monday 8008.55 8085.15 7913.3 7987.5 7987.5
10 03-10-2007 Wednesday 8029.8 8235.8 7820.25 8097.9 8097.899902
11 04-10-2007 Thursday 8083.3 8086.7 7828.65 8035.9 8035.899902
12 05-10-2007 Friday 8038.1 8066.55 7789.7 7845.25 7845.25
13 08-10-2007 Monday 7853.15 7935.45 7516.45 7626.4 7626.399902
14 09-10-2007 Tuesday 7580.9 7916.45 7535.05 7895.85 7895.850098
15 10-10-2007 Wednesday 7960.65 8081.05 7907.35 8030.65 8030.649902
16 11-10-2007 Thursday 8054.3 8177.75 8005.5 8158.8 8158.799805
17 15-10-2007 Monday 7962.55 8306.35 7962.55 8286.3 8286.299805
18 16-10-2007 Tuesday 8361.4 8491.65 8240.3 8452.2 8452.200195
19 17-10-2007 Wednesday 8071.55 8218.1 7641.5 8099.9 8099.899902
20 18-10-2007 Thursday 8055.15 8192.45 7519.7 7608.75 7608.75
21 19-10-2007 Friday 7637.4 7718.9 7279.3 7423.8 7423.799805

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

process:

  1. set index of the date
  2. resample the dataframe with 'D'
  3. drop the day is Sat and Sun
  4. fillna with 'ffill'
# Data Index should be timeSeries
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y')

# upsample
dfn = df.set_index('Date').resample('D').asfreq()

# covert Date to Monday .. Sunday
dfn['Day'] = dfn.index.strftime('%A')

# drop Day in ['Saturday', 'Sunday']
cond = dfn['Day'].isin(['Saturday', 'Sunday'])
dfn = dfn[~cond].fillna(method='ffill').reset_index()

output:

        Date        Day     Open     High      Low    Close  Adjusted Close
0  2007-09-17     Monday  6898.00  6977.20  6843.00  6897.10     6897.100098
1  2007-09-18    Tuesday  6921.15  7078.95  6883.60  7059.65     7059.649902
2  2007-09-19  Wednesday  7111.00  7419.35  7111.00  7401.85     7401.850098
3  2007-09-20   Thursday  7404.95  7462.90  7343.60  7390.15     7390.149902
4  2007-09-21     Friday  7378.30  7506.35  7367.15  7464.50     7464.500000
5  2007-09-24     Monday  7378.30  7506.35  7367.15  7464.50     7464.500000
6  2007-09-25    Tuesday  7658.50  7694.25  7490.20  7629.15     7629.149902
7  2007-09-26  Wednesday  7647.10  7829.85  7591.80  7755.90     7755.899902
8  2007-09-27   Thursday  7647.10  7829.85  7591.80  7755.90     7755.899902
9  2007-09-28     Friday  7838.25  8082.85  7836.05  8042.20     8042.200195
10 2007-10-01     Monday  8008.55  8085.15  7913.30  7987.50     7987.500000
11 2007-10-02    Tuesday  8008.55  8085.15  7913.30  7987.50     7987.500000
12 2007-10-03  Wednesday  8029.80  8235.80  7820.25  8097.90     8097.899902
13 2007-10-04   Thursday  8083.30  8086.70  7828.65  8035.90     8035.899902
14 2007-10-05     Friday  8038.10  8066.55  7789.70  7845.25     7845.250000
15 2007-10-08     Monday  7853.15  7935.45  7516.45  7626.40     7626.399902
16 2007-10-09    Tuesday  7580.90  7916.45  7535.05  7895.85     7895.850098
17 2007-10-10  Wednesday  7960.65  8081.05  7907.35  8030.65     8030.649902
18 2007-10-11   Thursday  8054.30  8177.75  8005.50  8158.80     8158.799805
19 2007-10-12     Friday  8054.30  8177.75  8005.50  8158.80     8158.799805
20 2007-10-15     Monday  7962.55  8306.35  7962.55  8286.30     8286.299805
21 2007-10-16    Tuesday  8361.40  8491.65  8240.30  8452.20     8452.200195
22 2007-10-17  Wednesday  8071.55  8218.10  7641.50  8099.90     8099.899902
23 2007-10-18   Thursday  8055.15  8192.45  7519.70  7608.75     7608.750000
24 2007-10-19     Friday  7637.40  7718.90  7279.30  7423.80     7423.799805

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...