You can read the CSV file in chunks using the chunksize
parameter of pd.read_csv and append each chunk to the HDF file:
import pandas as pd

hdf_key = 'hdf_key'
df_cols_to_index = [...] # list of columns (labels) that should be indexed
store = pd.HDFStore(hdf_filename)
for chunk in pd.read_csv(csv_filename, chunksize=500000):
    # don't index data columns in each iteration - we'll do it later ...
    store.append(hdf_key, chunk, data_columns=df_cols_to_index, index=False)
# index data columns in HDFStore, once, after all chunks have been appended
store.create_table_index(hdf_key, columns=df_cols_to_index, optlevel=9, kind='full')
store.close()
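
To try the approach end to end, here is a minimal self-contained sketch. It assumes PyTables is installed, and it uses a small generated CSV in place of the real file. The file names, the column names (row_id, amount, other) and the chunk size of 250 are made up for illustration only. It also reads the data back with a where clause on one of the indexed data columns.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical paths and column names, chosen only for this sketch.
tmp_dir = tempfile.mkdtemp()
csv_filename = os.path.join(tmp_dir, "data.csv")
hdf_filename = os.path.join(tmp_dir, "data.h5")
hdf_key = "hdf_key"
df_cols_to_index = ["row_id", "amount"]

# Write a small numeric CSV that stands in for the large input file.
pd.DataFrame({
    "row_id": np.arange(1000),
    "amount": np.arange(1000) * 0.5,
    "other": np.arange(1000) % 7,
}).to_csv(csv_filename, index=False)

# Append chunk by chunk without building indexes, then index once at the end.
with pd.HDFStore(hdf_filename, mode="w") as store:
    for chunk in pd.read_csv(csv_filename, chunksize=250):
        store.append(hdf_key, chunk, data_columns=df_cols_to_index, index=False)
    store.create_table_index(hdf_key, columns=df_cols_to_index,
                             optlevel=9, kind="full")

# Query on a data column; the filter is applied when reading from the store.
with pd.HDFStore(hdf_filename, mode="r") as store:
    total_rows = store.get_storer(hdf_key).nrows
    subset = store.select(hdf_key, where="row_id >= 990")

print(total_rows, len(subset))
```

Only columns listed in data_columns can be used in a where clause, so that list should cover every column you expect to filter on.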