Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Concatenate data in CSV files with overlapping data in columns

I have a couple CSV files that have vaccine data, such as this:

File 1

Entity,Code,Date,people_vaccinated
Wisconsin,,2021-01-12,125895
Wisconsin,,2021-01-13,125895
Wisconsin,,2021-01-14,135841
Wisconsin,,2021-01-15,151387
Wisconsin,,2021-01-19,188144
Wisconsin,,2021-01-20,193461
Wisconsin,,2021-01-21,204746
Wisconsin,,2021-01-22,221067
Wisconsin,,2021-01-23,241512
Wisconsin,,2021-01-24,260664
Wyoming,,2021-01-12,13577
Wyoming,,2021-01-13,14406
Wyoming,,2021-01-14,17310
Wyoming,,2021-01-15,19931
Wyoming,,2021-01-19,24788
Wyoming,,2021-01-20,25841
Wyoming,,2021-01-21,25841
Wyoming,,2021-01-22,29993
Wyoming,,2021-01-23,32746
Wyoming,,2021-01-24,35868

File 2

Entity,Code,Date,people_fully_vaccinated
Wisconsin,,2021-01-12,11343
Wisconsin,,2021-01-13,11343
Wisconsin,,2021-01-15,17108
Wisconsin,,2021-01-19,23641
Wisconsin,,2021-01-20,27312
Wisconsin,,2021-01-21,32268
Wisconsin,,2021-01-22,37901
Wisconsin,,2021-01-23,42229
Wisconsin,,2021-01-24,45641
Wyoming,,2021-01-12,2116
Wyoming,,2021-01-13,2559
Wyoming,,2021-01-15,2803
Wyoming,,2021-01-19,3242
Wyoming,,2021-01-20,3441
Wyoming,,2021-01-21,3441
Wyoming,,2021-01-22,4515
Wyoming,,2021-01-23,4773
Wyoming,,2021-01-24,4895

Not all the data (specifically dates going with locations) overlaps, but for the ones that do, how would I combine the last column? I'm guessing using pandas would be best, but I don't want to get stuck messing with a bunch of nested loops.

Question from: https://stackoverflow.com/questions/65887182/concatenate-data-in-csv-files-with-overlapping-data-in-columns


1 Reply

If you want to merge file1 with file2 while keeping only the records present in file1, use a left join:

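import pandas as pd
# Assume file1_df and file2_df are the DataFrames loaded from File 1 and File 2 (e.g. with pd.read_csv).
# A left join keeps every row of file1_df and fills in people_fully_vaccinated where the keys match.
merged_df = pd.merge(file1_df, file2_df, how='left', on=['Entity', 'Code', 'Date'])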

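Note: if you are familiar with set operations, you can get a right outer join, left outer join, inner join, or full outer join by changing the how parameter in the call above. To keep only the Entity/Date combinations that appear in both files, use how='inner'. See the pandas documentation for pd.merge for details.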
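
To make the difference between the join types concrete, here is a minimal sketch that runs on a few rows taken from the question. The data is inlined through io.StringIO purely for illustration (with real files you would pass the file paths to pd.read_csv instead), and the names file1_csv, file2_csv, keys, left_df, inner_df and outer_df are my own, not from the question.

```python
import io

import pandas as pd

# A few Wisconsin rows from the question, inlined for illustration only.
file1_csv = """Entity,Code,Date,people_vaccinated
Wisconsin,,2021-01-13,125895
Wisconsin,,2021-01-14,135841
Wisconsin,,2021-01-15,151387
"""
file2_csv = """Entity,Code,Date,people_fully_vaccinated
Wisconsin,,2021-01-13,11343
Wisconsin,,2021-01-15,17108
"""

file1_df = pd.read_csv(io.StringIO(file1_csv))
file2_df = pd.read_csv(io.StringIO(file2_csv))

# Columns shared by both files; rows are matched on all three.
# The empty Code column is read as NaN in both frames, and pandas
# matches NaN keys with each other when merging.
keys = ["Entity", "Code", "Date"]

# Left join: every row of file1_df, NaN where file2_df has no match.
left_df = pd.merge(file1_df, file2_df, how="left", on=keys)

# Inner join: only the Entity/Date combinations present in both files.
inner_df = pd.merge(file1_df, file2_df, how="inner", on=keys)

# Outer join: the union of rows from both files.
outer_df = pd.merge(file1_df, file2_df, how="outer", on=keys)

print(left_df)
print(inner_df)
```

In this sample, 2021-01-14 exists only in File 1, so the left join keeps that row with a missing people_fully_vaccinated value, while the inner join drops it. No loops are needed in either case.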

