I am trying to use pysam.view() to filter out certain alignments from a BAM file. The problem I am facing is how to include several regions in the filter.
pysam.view() emulates the samtools view command which allows one to enter several regions separated by the space character, eg:
samtools view opts bamfile chr1:2010000-20200000 chr2:2010000-20200000
But the corresponding pysam.view call:
pysam.view(ops, bamfile, '1:2010000-20200000 2:2010000-20200000')
does not work. It does not return any alignments. I'm quite sure the problem lies in how to specify the list of regions, since the following command works fine:
pysam.view(ops, bamfile, '1:2010000-20200000')
and returns alignments.
My question is: does pysam.view support multiple regions and how does one specify this list? I have searched for documentation regarding this but not found anything.
(Also note that the number indicating the end of each of your regions is ~10x larger than the beginning - it seems you might have intended
2010000-2020000instead.)I have tested it using the following code:
However, this way of extracting alignments is not very efficient, as the output you get is one large
str, instead of individual alignments. To get alistof separate alignments instead, use the following code:Each element of
alignmentthen ends up being apysam.AlignedSegmentobject, with which you can work further using the functions in pysam API.