I am attempting to merge a number of CSV files. My initial function aims to:
- Look inside a directory and count the number of files within (assume all are .csv)
- Open the first CSV and append each row into a list
- Clip the top three rows (there's some useless column title info I don't want)
- Store these results in a list I've called 'archive'
- Open the next CSV file and repeat (clip and append them to 'archive')
- When we're out of CSV files, write the complete 'archive' to a file in a separate folder.
So, for instance, if I were to start with three CSV files that look something like this:
CSV 1
[]
[['Title'],['Date'],['etc']]
[]
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
CSV 2
[]
[['Title'],['Date'],['etc']]
[]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
CSV 3
[]
[['Title'],['Date'],['etc']]
[]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]
At the end of which I'd hope to get something like...
CSV OUTPUT
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]
So... I set about writing this:
import os
import csv

path = './Path/further/into/file/structure'
directory_list = os.listdir(path)
directory_list.sort()
archive = []
for file_name in directory_list:
    temp_storage = []
    path_to = path + '/' + file_name
    file_data = open(path_to, 'r')
    file_CSV = csv.reader(file_data)
    for row in file_CSV:
        temp_storage.append(row)
    for row in temp_storage[3:-1]:
        archive.append(row)
archive_file = open("./Path/elsewhere/in/file/structure/archive.csv", 'wb')
wr = csv.writer(archive_file)
for row in range(len(archive)):
    lastrow = row
    wr.writerow(archive[row])
    print row
This seems to work... except when I check my output file, it seems to have stopped writing at a strange point near the end,
e.g.:
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/0
It's really weird; I can't work out what's gone wrong. It seemed to be writing fine but decided to stop halfway through a list entry. Tracing it back, I'm sure this has something to do with my last write "for loop", but I'm not too familiar with the csv methods. I have had a read through the documentation and am still stumped.
Can anyone point out where I've gone wrong, how I might fix it, and perhaps whether there is a better way of going about all this?
Many Thanks -Huw
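A likely explanation, offered as a sketch rather than a confirmed diagnosis: archive_file is never closed in the code above, so the last buffered chunk of output may never be flushed to disk, which would match a file that stops partway through a row. Separately, the slice temp_storage[3:-1] also drops the last row of each input, which the expected output keeps. The version below is written for Python 3 (the original uses the Python 2 print statement and 'wb' mode), and the function name merge_csv_files and its parameters are hypothetical names introduced for illustration.

```python
import csv
import os


def merge_csv_files(source_dir, archive_path, skip_rows=3):
    """Merge every file in source_dir into archive_path.

    Assumes every entry in source_dir is a CSV file and that the first
    skip_rows rows of each file are header rows to discard.
    """
    archive = []
    for file_name in sorted(os.listdir(source_dir)):
        path_to = os.path.join(source_dir, file_name)
        # 'with' closes each input file as soon as it has been read.
        with open(path_to, newline='') as file_data:
            rows = list(csv.reader(file_data))
        # Keep everything after the header rows; [3:-1] would also
        # discard the last data row of each file.
        archive.extend(rows[skip_rows:])

    # Leaving this 'with' block flushes any buffered rows to disk and
    # closes the output file, so the tail of the archive is not lost.
    with open(archive_path, 'w', newline='') as archive_file:
        csv.writer(archive_file).writerows(archive)
    return archive
```

With the paths from the question, this could be called as merge_csv_files('./Path/further/into/file/structure', './Path/elsewhere/in/file/structure/archive.csv'). If staying on Python 2, adding archive_file.close() after the final loop should have the same effect on the truncation.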