To use such an exporter, you need to create your own item pipeline that processes your spider's output. Assuming the simple case where you want all spider output in one file, this is the pipeline you should use (pipelines.py):
from scrapy import signals
# In Scrapy versions before 1.0 this lived in scrapy.contrib.exporter.
from scrapy.exporters import CsvItemExporter


class CSVPipeline(object):

    def __init__(self):
        # One open file handle per running spider.
        self.files = {}

    @classmethod
    def from_crawler(cls, crawler):
        pipeline = cls()
        crawler.signals.connect(pipeline.spider_opened, signals.spider_opened)
        crawler.signals.connect(pipeline.spider_closed, signals.spider_closed)
        return pipeline

    def spider_opened(self, spider):
        # The exporter writes bytes, so the file is opened in binary mode.
        csv_file = open('%s_items.csv' % spider.name, 'w+b')
        self.files[spider] = csv_file
        self.exporter = CsvItemExporter(csv_file)
        # Placeholder names: list the fields you want to export here.
        # The order of this list is the order of the CSV columns.
        self.exporter.fields_to_export = ['field_a', 'field_b']
        self.exporter.start_exporting()

    def spider_closed(self, spider):
        self.exporter.finish_exporting()
        csv_file = self.files.pop(spider)
        csv_file.close()

    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item
Of course, you need to remember to add this pipeline to your configuration file (settings.py):
ITEM_PIPELINES = {'myproject.pipelines.CSVPipeline': 300 }
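To see why the order of fields_to_export matters, here is a minimal standard-library sketch of the same idea. It uses csv.DictWriter rather than Scrapy's CsvItemExporter, so it only illustrates the column-ordering behaviour, not the exporter's actual implementation. The field names, the sample item, and the export_rows helper are all made up for the example.

```python
import csv
import io

# Hypothetical field names, used only for illustration.
FIELDS_TO_EXPORT = ["title", "price"]


def export_rows(items, fields):
    """Write dict-like items as CSV text, with columns in the order of `fields`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # silently skip keys not listed in `fields`
        lineterminator="\n",
    )
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


# The item's own key order differs from the requested column order,
# and it carries an extra key ("url") that is not exported.
items = [{"price": "9.99", "title": "Book", "url": "http://example.com"}]
output = export_rows(items, FIELDS_TO_EXPORT)
print(output)
```

The columns come out in the order given by the field list, not in the order the keys appear in each item, and keys missing from the list are left out of the file.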