fails to direct to bowtie2 index when using snakemake

147 views Asked by At

I am trying to execute this snakemake-rule:

rule test_bowtie2:
    input:
        idx=multiext(
            "/home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS",
            ".1.bt2",
            ".2.bt2",
            ".3.bt2",
            ".4.bt2",
            ".rev.1.bt2",
            ".rev.2.bt2"
        ),
        fastq="sample.fq.gz"
    output:
        "sample_bt.bam"
    shell:
        """
        nice -19 /home/ctools/bowtie2-2.4.4/bowtie2 -x {input.idx} --interleaved {input.fastq} |/home/ctools/bin/samtools view -bS -F4 - > {output}
        """

However, i get the error:

(ERR): "/home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS.1.bt2" does not exist or is not a Bowtie 2 index
Exiting now ...
[main_samview] fail to read the header from "-".
[Fri Jun  9 20:12:40 2023]
Error in rule test_bowtie2:
    jobid: 2
    output: sample_bt.bam
    shell:
        
        nice -19 /home/ctools/bowtie2-2.4.4/bowtie2 -x /home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS.1.bt2 /home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS.2.bt2 /home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS.3.bt2 /home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS.4.bt2 /home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS.rev.1.bt2 /home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS.rev.2.bt2 --interleaved sample.fq.gz |/home/ctools/bin/samtools view -bS -F4 - > SRR22474192_bt.bam
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job test_bowtie2 since they might be corrupted:
sample_bt.bam
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

If i instead just pass input:

idx="/home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS"

I get the error:

MissingInputException in line 30 of /Snakefile:
Missing input files for rule test_bowtie2:
/home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS

Can anybody please help me how to pass the index files to the command in a snakemake workflow? I can get the command below to work without problems directly from the commandline:

nice -19 /home/ctools/bowtie2-2.4.4/bowtie2 -x /home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS  --interleaved sample.fq.gz |/home/ctools/bin//samtools view -bS   -F4 - > sample_bt.bam

Thanks and fingers crossed!

1

There are 1 answers

1
dariober On BEST ANSWER
-x {input.idx}

Is not correct because you pass all the index files as input instead of passing only the prefix. Easiest fix may be something like below, although it is not great to hard-code paths but hopefully you get the idea:

rule test_bowtie2:
    input:
        idx=multiext(
            "/home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS",
            ".1.bt2",
            ".2.bt2",
            ".3.bt2",
            ".4.bt2",
            ".rev.1.bt2",
            ".rev.2.bt2"
        ),
        fastq="sample.fq.gz"
    params:
        idx="/home/databases/genomes/Homo_sapiens/rCRS/bowtieindex/rCRS",
    output:
        "sample_bt.bam"
    shell:
        """
        nice -19 /home/ctools/bowtie2-2.4.4/bowtie2 -x {params.idx} --interleaved {input.fastq} \
        |/home/ctools/bin/samtools view -bS -F4 - > {output}
        """