I'm scraping a website to export the data into a semantic format (n3). However, I also want to perform some data analysis on that data, so having it in a csv format is more convenient.
To get the data in both formats I can do
scrapy spider -t n3 -o data.n3
scrapy spider -t csv -o data.csv
However, this scrapes the data twice, which I cannot afford with large amounts of data.
Is there a way to export the same scraped data into multiple formats? (without downloading the data more than once)
It would be useful to have an intermediate representation of the scraped data that could be exported into different formats, but it seems there is no way to do this with Scrapy.
From what I understand after exploring the source code and the documentation, the -t option refers to the FEED_FORMAT setting, which cannot have multiple values. Also, the FeedExporter built-in extension (source) works with a single exporter only.
Actually, think about making a feature request at the Scrapy Issue Tracker.
As a workaround, define an item pipeline that feeds every item to multiple exporters. For example, here is how to export into both CSV and JSON formats: