I'm currently using `drake` to run a set of >1k simulations. I've estimated that it would take about two days to run the complete set, but I also expect my computer to crash at any point during that period because, well, it has.
Apparently stopping the plan discards any targets that were already built, so essentially this means I can't use `drake` for its intended purpose.
I suppose I could make a function that actually edits the R file where the plan is specified in order to make `drake` sequentially add targets to its cache, but that seems utterly hackish.
Any ideas on how to deal with this?
EDIT: The actual problem seems to come from using `set.seed` inside my data-generating functions. I was aware that `drake` already does this for the user in a way that ensures reproducibility, but I figured that if I just left my functions the way they were, it wouldn't change anything, since `drake` would be ensuring that the random seed I chose always ends up being the same? Guess not, but since I removed that step things are caching fine, so the issue is solved.
To bring onlookers up to speed, I will try to spell out the problem. @zipzapboing, please correct me if my description is off-target.
Let's say you have a script that generates a `drake` plan and executes it.

Created on 2018-11-12 by the reprex package (v0.2.1)
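The original reprex code is not reproduced here, so the following is only a minimal sketch of what such a script might look like. It assumes a hypothetical `simulate_data()` that calls `set.seed()` on its argument, and it assumes drake 7.0.0 or later for the `transform = map()` syntax. The target name `sim` and the number of simulations are made up for illustration.

```r
library(drake)

# Hypothetical data-generating function that sets its own seed.
simulate_data <- function(seed) {
  set.seed(seed)
  rnorm(100)
}

# Seeds drawn at random, so they differ from one R session to the next.
seeds <- sample.int(1e6, 3)

# One target per seed. `!!` splices the seed values into the plan,
# so each command contains a literal seed.
plan <- drake_plan(
  sim = target(
    simulate_data(seed),
    transform = map(seed = !!seeds)
  )
)

print(plan)

make(plan) # First call: builds all three targets.
make(plan) # Second call, same session: same plan, so nothing should rebuild.
```

Because the seeds are baked into the commands, the plan is only stable for as long as `seeds` stays the same.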
The second `make()` worked just fine, right? But if you were to run the same script in a different session, you would end up with a different plan. The randomly generated `seed` arguments to `simulate_data()` would be different, so all your targets would build from scratch.
One solution is to be extra careful to hold onto the same `plan`. However, there is an even easier way: just let `drake` set the seeds for you. `drake` automatically gives each target its own reproducible random seed. These target-level seeds are deterministically generated by a root seed (the `seed` argument to `make()`) and the names of the targets.
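As a rough sketch of that approach (not the original reprex): the function below no longer calls `set.seed()`, and the plan uses fixed target names. The names `sim_1` to `sim_3`, the root seed `123`, and the throwaway cache are assumptions made for illustration.

```r
library(drake)

# No set.seed() here: drake assigns each target its own seed before building it.
simulate_data <- function(n = 100) {
  rnorm(n)
}

# Fixed target names and commands, so the plan is identical in every session.
plan <- drake_plan(
  sim_1 = simulate_data(),
  sim_2 = simulate_data(),
  sim_3 = simulate_data()
)

# A throwaway cache keeps this sketch from touching an existing project;
# a real project would normally use the default cache.
cache <- new_cache(tempfile())

# The root seed value is arbitrary; it only has to stay the same across runs.
make(plan, cache = cache, seed = 123)
first <- readd(sim_1, cache = cache)

# Force every target to rebuild, keeping the same root seed.
make(plan, cache = cache, seed = 123, trigger = trigger(condition = TRUE))

# Same root seed and same target name should reproduce the same draws.
identical(first, readd(sim_1, cache = cache))
```

Since each seed depends on the target name, `sim_1` and `sim_2` should still receive different random draws even though their commands are identical.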
I really should write more in the manual about how seeds work in `drake` and highlight the original pitfall raised in this thread. I doubt you are the only one who struggled with this issue.