python - Merge csv files - long processing time

I have a folder with multiple (hundreds to low thousands of) CSV files, each with about 150 rows and 2 columns; each file is one trajectory curve. I need to merge them into one summary file to later use for plotting my data.

At the moment I use pandas concat in a loop:

import pandas as pd

df = pd.DataFrame()
for file in folder:  # folder is an iterable of csv file paths
    df_temp = pd.read_csv(file, skiprows=10, usecols=[2, 3])
    df = pd.concat([df, df_temp], ignore_index=True)

The issue I have now is the long processing time (10+ minutes) and an occasional MemoryError.

Is there some less memory-intensive and reasonably fast way to merge the CSV files?

Question from: https://stackoverflow.com/questions/65951076/merge-csv-files-long-processing-time

1 Reply

You can create a list of DataFrames first and then pass the list to concat, so the accumulated data is not copied on every iteration:

dfs = []
for file in folder:
    dfs.append(pd.read_csv(file, skiprows=10, usecols=[2, 3]))
df = pd.concat(dfs, ignore_index=True)

The same solution with a list comprehension:

dfs = [pd.read_csv(file, skiprows=10, usecols=[2, 3]) for file in folder]
df = pd.concat(dfs, ignore_index=True)
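
If even the merged result is too large to hold in memory, a further option is to write each file's rows straight into the summary CSV on disk instead of building one big DataFrame. This is only a minimal sketch, not part of the original answer; the glob pattern, the folder name trajectories, and the output name summary.csv are hypothetical, while skiprows and usecols mirror the question's code:

import glob

import pandas as pd

# Sketch (assumption, not from the original answer): stream each file's two
# columns straight into the summary CSV, so only one small DataFrame
# (~150 rows) is held in memory at a time.
files = glob.glob("trajectories/*.csv")  # hypothetical folder name

with open("summary.csv", "w", newline="") as out:
    for i, path in enumerate(files):
        part = pd.read_csv(path, skiprows=10, usecols=[2, 3])
        part.to_csv(out, index=False, header=(i == 0))  # write header only once

The trade-off is one small read-and-write per file instead of a single concat, so it is usually a bit slower than the list-of-DataFrames approach but keeps peak memory roughly constant.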
