Can I use snakemake with humans-in-the-loop?


I am very curious about snakemake but I'm not sure it fits my use case, because I have humans in the loop.

My process is something like this:

  1. Start with a baseline binary classification model.
  2. Generate 100 examples near the margin (predicted probability near 0.5).
  3. Have humans label those 100 examples.
  4. Add the 100 examples to the data set and retrain.
  5. Go to step 1, with the retrained model as the new baseline.

Thus, it's a form of active learning with humans in the loop.
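For concreteness, step 2 could look roughly like the sketch below. It is plain Python rather than Snakemake, and the function name `select_near_margin` and the sample probabilities are made up for illustration; the real selection would depend on how the model exposes its predictions.

```python
def select_near_margin(probabilities, k=100):
    """Return the indices of the k examples whose predicted
    probability is closest to 0.5 (hypothetical helper)."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Toy predictions for five unlabelled examples (illustrative values only).
probs = [0.93, 0.48, 0.10, 0.55, 0.51]
print(select_near_margin(probs, k=2))  # → [4, 1]
```

The selected indices are what would be handed to the human labellers in step 3.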

Is snakemake a good fit for this? Or does the human in the loop undermine the principle of reproducibility? If I should use snakemake, are there any relevant pointers for something similar?

There is 1 answer

ning On

You can achieve this by treating each iteration of the loop as a distinct set of Snakemake outputs, indexed by an iteration wildcard:

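rule generate_example:
    # Examples near the margin, generated from the model of the same iteration.
    output: "examples/{iter}.tsv"
    input: "model/{iter}.tsv"
    wildcard_constraints: iter = r"\d+"

rule build_baseline_model:
    output: "model/0.tsv"

rule build_subsequent_model:
    # Model N is trained on the labelled examples from iterations 0 .. N-1.
    # Wildcard values are strings, so convert before passing to range().
    output: "model/{iter}.tsv"
    input: lambda wc: expand("examples-labelled/{iter}.tsv", iter=range(0, int(wc.iter)))
    wildcard_constraints: iter = r"[1-9]\d*"  # not 0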

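So, yes, I think Snakemake is a good fit for your process: each iteration becomes its own set of files, so every loop is reproducible given the labels collected so far. Note that no rule here produces the examples-labelled/ files; they come from the human labellers, and Snakemake cannot build the next model until the corresponding file exists.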
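To make the dependency between iterations explicit, here is a small plain-Python sketch of what the input function for the subsequent-model rule computes, plus a check for which labelled files are still missing. The helper names `labelled_inputs` and `missing_labels` are my own, not part of Snakemake, and the `examples-labelled` directory is just the path used in the rules above.

```python
from pathlib import Path


def labelled_inputs(iteration, root="examples-labelled"):
    """Labelled files needed to build model/{iteration}.tsv,
    i.e. the rounds 0 .. iteration-1 (hypothetical helper)."""
    # Wildcards arrive as strings, so accept "3" as well as 3.
    return [f"{root}/{i}.tsv" for i in range(int(iteration))]


def missing_labels(iteration, root="examples-labelled"):
    """Labelled files that humans have not delivered yet."""
    return [p for p in labelled_inputs(iteration, root) if not Path(p).exists()]


print(labelled_inputs("3"))
# → ['examples-labelled/0.tsv', 'examples-labelled/1.tsv', 'examples-labelled/2.tsv']
```

Something like `missing_labels(3)` could be run before asking Snakemake for model/3.tsv, to see which rounds are still waiting on a human.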