pandas - How to write a large csv file to hdf5 in python?

Question

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:51:26+0000

You can read the CSV file in chunks using the chunksize parameter of pd.read_csv and append each chunk to the HDF5 file, so the whole file never has to fit in memory at once:

import pandas as pd

# csv_filename and hdf_filename are the paths to the input CSV and the output HDF5 file
hdf_key = 'hdf_key'
df_cols_to_index = [...] # list of columns (labels) that should be indexed
store = pd.HDFStore(hdf_filename)

for chunk in pd.read_csv(csv_filename, chunksize=500000):
    # don't index data columns in each iteration - we'll do it later ...
    store.append(hdf_key, chunk, data_columns=df_cols_to_index, index=False)

# index data columns in HDFStore once, after all chunks have been appended
store.create_table_index(hdf_key, columns=df_cols_to_index, optlevel=9, kind='full')
store.close()
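
For anyone who wants to try the approach end to end, here is a minimal self-contained sketch of the same chunked-append pattern. It assumes PyTables is installed alongside pandas. The helper name csv_to_hdf, the file names, the key "demo", and the tiny synthetic CSV are all made up for illustration; a real run would use your own file and a much larger chunksize.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def csv_to_hdf(csv_path, hdf_path, key, index_cols, chunksize=1000):
    """Append a CSV to an HDF5 table chunk by chunk, then index it once.

    Hypothetical helper for illustration; requires PyTables.
    """
    with pd.HDFStore(hdf_path, mode="w") as store:
        for chunk in pd.read_csv(csv_path, chunksize=chunksize):
            # index=False: skip index creation on every append
            store.append(key, chunk, data_columns=index_cols, index=False)
        # Build the index a single time, after all rows are written
        store.create_table_index(key, columns=index_cols, optlevel=9, kind="full")


# Small synthetic CSV standing in for the large file (illustrative data only)
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "demo.csv")
hdf_path = os.path.join(tmp_dir, "demo.h5")
pd.DataFrame(
    {"id": np.arange(2500), "value": np.arange(2500) * 0.5}
).to_csv(csv_path, index=False)

csv_to_hdf(csv_path, hdf_path, key="demo", index_cols=["id"])

# Read back: total row count, plus an on-disk query against the data column
with pd.HDFStore(hdf_path, mode="r") as store:
    n_rows = store.get_storer("demo").nrows
    subset = store.select("demo", where="id >= 2490")

print(n_rows, len(subset))
```

One thing to watch for with real data: the width of string columns is fixed when the table is first created, so a longer string in a later chunk makes the append fail. Passing min_itemsize to store.append (for example a dict mapping the column name to a maximum length) is the usual way around that.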
