Snakemake touch command


I use Snakemake and am trying to write a workflow that aligns reads and creates bigWig files. After the STAR alignment I would like to create a marker file, so that the wig generation runs only after all the samples have been aligned.

I have this error:

  snakemake --core 3 --configfile config_tardis.yml  -np
RuleException in line 40 of /home/centos/rna_test/rules/star2.rules:
Could not resolve wildcards in rule star_map:
sample

I tried to use this code:

rule star_map:
    input:
        dt="trim/{sample}/",
        forward_paired="trim/{sample}/{sample}_forward_paired.fq.gz",
        reverse_paired="trim/{sample}/{sample}_reverse_paired.fq.gz",
        forward_unpaired="trim/{sample}/{sample}_forward_unpaired.fq.gz",
        reverse_unpaired="trim/{sample}/{sample}_reverse_unpaired.fq.gz",
        t1p="database.done",
    output:
        out1="ALIGN/{sample}/Aligned.sortedByCoord.out.bam",
        out2=touch("Star.align.done")
    params:
        genomedir = config['references']['basepath'],
        sample=config["samples"],
        platform_unit=config['platform'],
        cente=config['center']
    threads: 12
    log: "ALIGN/log/{params.sample}_star.log"
    shell:
        'STAR --runMode alignReads  --genomeDir {params.genomedir} '
        r' --outSAMattrRGline  ID:{params.sample} SM:{params.sample} PL:{config[platform]}  PU:{params.platform_unit} CN:{params.cente} '
        '--readFilesIn   {input.forward_paired} {input.reverse_paired} {input.forward_unpaired} {input.reverse_unpaired} \
       --readFilesCommand zcat \
       --outStd Log \
       --outSAMunmapped Within \
       --outSAMtype BAM SortedByCoordinate \
       --runThreadN  {threads} --outFileNamePrefix  {output.out1};{output.out2}  2> {log} '




rule star_wigg_file:
    input:
        f1= "ALIGN/{sample}/Aligned.sortedByCoord.out.bam",
        t1p="Star.align.done",
    output:
        "ALIGN/{sample}/wiggle/"
    threads: 12

    shell:
       'STAR --runMode inputAlignmentsFromBAM -inputBAMfile {input.f1} --outWigType wiggle \
  --outWigStrand Stranded '

So the problem seems to be associated with the introduction of touch.

There is 1 answer
TBoyarski On

You have not provided any mechanism for Snakemake to begin determining the value of {sample}. The final output that Snakemake is asked to produce must be an explicit string with no wildcards. Snakemake needs this string so that it can compare it against the output patterns of each rule and find a match.
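The matching described above can be sketched in plain Python. This is only an illustration and not Snakemake's actual implementation: the helper names pattern_to_regex and match_target are made up for this example, and the sketch assumes each wildcard matches one or more arbitrary characters, ignoring features such as wildcard constraints.

```python
import re


def pattern_to_regex(pattern):
    """Turn an output pattern like 'ALIGN/{sample}/x.bam' into a compiled regex.

    Hypothetical helper for illustration only.
    """
    # re.split with a capturing group alternates: literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    seen = set()
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text between wildcards must match exactly.
            regex += re.escape(part)
        elif part in seen:
            # A repeated wildcard must take the same value as before.
            regex += f"(?P={part})"
        else:
            # Assumption: a wildcard matches one or more characters.
            seen.add(part)
            regex += f"(?P<{part}>.+)"
    return re.compile(regex + "$")


def match_target(pattern, target):
    """Return the wildcard values if target fits pattern, else None."""
    m = pattern_to_regex(pattern).match(target)
    return m.groupdict() if m else None


bam_pattern = "ALIGN/{sample}/Aligned.sortedByCoord.out.bam"

# A concrete target lets the wildcard be inferred.
print(match_target(bam_pattern, "ALIGN/Patient1/Aligned.sortedByCoord.out.bam"))

# A wildcard-free output matches, but carries no value for {sample}.
print(match_target("Star.align.done", "Star.align.done"))
```

A target such as "Star.align.done" matches its own pattern with an empty set of wildcards, so it gives Snakemake nothing from which to fill in {sample} in the inputs of star_map.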

I like to define {sample} through an expand() in the input of rule all, as per the author's suggestion.

  • It preserves the core code base (i.e. nothing is constantly being edited in the code that you run over and over again, or that you are trying to preserve for auditing and reproducibility).
  • It also provides a template for comparing rules: if they all use the wildcard "{sample}", the inputs and outputs are easier to compare across rules, since linked paths should be identical. That makes it much easier for others to notice that your two rules are linked without running the code.

Add something like the following to the top of your current file, and define a list of samples, "sampleLIST", in config_tardis.yml.

In the Snakefile as the top-most rule:

rule all:
    input:
        expand("ALIGN/{sample}/wiggle", sample=config["sampleLIST"])

In the configuration file config_tardis.yml add:

sampleLIST: ['Patient1','Patient2','Patient3']
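To see which explicit targets rule all ends up requesting, here is a rough plain-Python imitation of expand(). The function expand_sketch is hypothetical and only mimics the basic behaviour (every combination of the given wildcard values); in a Snakefile you would call Snakemake's own expand(). The config dictionary stands in for the parsed config_tardis.yml.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Fill a pattern with every combination of the given wildcard values.

    Hypothetical stand-in for Snakemake's expand(), for illustration only.
    """
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


# Stand-in for the parsed config_tardis.yml.
config = {"sampleLIST": ["Patient1", "Patient2", "Patient3"]}

targets = expand_sketch("ALIGN/{sample}/wiggle", sample=config["sampleLIST"])
for target in targets:
    print(target)
```

Each resulting string is a concrete path with no wildcards left, which is what Snakemake needs in order to work backwards through the rules and infer {sample} for every job.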

On a side note, as per the comments, I'm also interested in addressing the use of touch. A pipeline design should rely on Snakemake's dependency resolution over real output files. The file "Star.align.done" seems like a proxy file; if so, there might be a way to do this without it.