I solved this by doing the export in my item pipeline instead of using the feed exporter from the command line. That lets me use close_spider to write the header (with a placeholder row) when the crawl returns no results.
I'd obviously welcome any improvements if I've missed something.
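Since the export now lives in the pipeline, the pipeline has to be enabled in the project settings. A minimal sketch, assuming the class sits in Generic/pipelines.py (the usual layout, but the module path is a guess based on the Generic.items import, and the priority value is arbitrary):

```python
# settings.py (sketch). The dotted path is an assumption about the project
# layout; adjust it to wherever GenericPipeline is actually defined.
ITEM_PIPELINES = {
    "Generic.pipelines.GenericPipeline": 300,  # arbitrary priority; lower numbers run first
}
```

With this in place the spider would be run with a plain `scrapy crawl <spider>` and no `-o` option, so the feed exporter is not writing a second output.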
Pipeline code:
from scrapy.exceptions import DropItem
from scrapy.exporters import CsvItemExporter

from Generic.items import GenericItem


class GenericPipeline:
    def __init__(self):
        # Emails already exported, used to drop duplicates.
        self.emails_seen = set()

    def open_spider(self, spider):
        # CsvItemExporter writes bytes, so the file is opened in binary mode.
        self.file = open("results.csv", 'wb')
        self.exporter = CsvItemExporter(self.file)
        self.exporter.start_exporting()

    def close_spider(self, spider):
        # The exporter only writes the header along with the first item,
        # so export a placeholder row when nothing was scraped.
        if not self.emails_seen:
            header = GenericItem()
            header["email"] = "None Found"
            self.exporter.export_item(header)
        self.exporter.finish_exporting()
        self.file.close()

    def process_item(self, item, spider):
        if item["email"] in self.emails_seen:
            raise DropItem(f"Duplicate item found: {item!r}")
        else:
            self.emails_seen.add(item["email"])
            self.exporter.export_item(item)
            return item
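
If you want to check the dedupe and "None Found" fallback without running a crawl, the same logic can be sketched with only the standard library. This is not the Scrapy pipeline itself: write_unique_emails is a hypothetical helper, and unlike CsvItemExporter (which writes the header with the first exported item) csv.DictWriter writes the header explicitly up front.

```python
import csv
import io


def write_unique_emails(rows, out, fieldnames=("email",)):
    """Write rows to `out` as CSV, skipping rows whose email was already seen.

    If no row was written, a single placeholder row is emitted instead,
    mirroring what close_spider does in the pipeline.
    """
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    seen = set()
    for row in rows:
        if row["email"] in seen:
            continue  # plays the role of raising DropItem
        seen.add(row["email"])
        writer.writerow(row)
    if not seen:
        writer.writerow({"email": "None Found"})
    return seen


# Example: an empty "crawl" still produces a header and a placeholder row.
buf = io.StringIO()
write_unique_emails([], buf)
print(buf.getvalue())
```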