The use case is as follows:
- I scrape data from a bigger database (read-only access) on a fixed schedule, and each scrape takes roughly 30 minutes to 1 hour
- the resulting table will always have more than 20k rows; the data can be grouped by a dataset_id column, which is restricted to 4 values like an enum
- when I query all of the rows for one dataset_id, for example
SELECT * FROM db WHERE dataset_id = A
it is essential that all the records come from the same scrape (I must never get mixed data from different scrapes).
The question is: how do I keep the old data available until the new scrape is finished, and only then switch to the new data and delete the old scrape?
I have thought of the following option:
- have 2 tables and switch between them when a newer scrape is finished
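To make the two-table idea concrete, here is a minimal sketch, not a definitive implementation. It uses Python's sqlite3 module as a stand-in because I have not said which database engine I am on. The table names scrape_staging and scrape_live, the payload column, and the load_scrape helper are all hypothetical. The new scrape is written to a staging table, and only once it is complete is it swapped in for the live table inside one transaction, so readers see either the old scrape or the new one, never a mix.

```python
import sqlite3


def load_scrape(conn, rows):
    """Load a finished scrape into a staging table, then swap it in.

    Assumes `conn` was opened with isolation_level=None so that
    transactions are controlled explicitly with BEGIN/COMMIT.
    Table and column names are hypothetical.
    """
    # Step 1: build the staging table. Readers still query scrape_live.
    conn.execute("BEGIN")
    try:
        conn.execute("DROP TABLE IF EXISTS scrape_staging")
        conn.execute(
            "CREATE TABLE scrape_staging ("
            "dataset_id TEXT NOT NULL, payload TEXT)"
        )
        conn.executemany(
            "INSERT INTO scrape_staging (dataset_id, payload) VALUES (?, ?)",
            rows,
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise

    # Step 2: replace the live table with the staging table in one
    # transaction, which also deletes the old scrape.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("DROP TABLE IF EXISTS scrape_live")
        conn.execute("ALTER TABLE scrape_staging RENAME TO scrape_live")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:", isolation_level=None)
    load_scrape(conn, [("A", "first scrape"), ("B", "first scrape")])
    load_scrape(conn, [("A", "second scrape")])
    print(conn.execute(
        "SELECT * FROM scrape_live WHERE dataset_id = 'A'"
    ).fetchall())
```

Whether the drop and rename can be wrapped in a single transaction like this depends on the engine, so the swap step would need to be checked against the actual database in use.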