I am trying to use pysam.view() to filter out certain alignments from a BAM file. The problem I am facing is how to include several regions in the filter.
pysam.view() emulates the samtools view command which allows one to enter several regions separated by the space character, eg:
samtools view opts bamfile chr1:2010000-20200000 chr2:2010000-20200000
But the corresponding pysam.view call:
pysam.view(ops, bamfile, '1:2010000-20200000 2:2010000-20200000')
does not work. It does not return any alignments. I'm quite sure the problem lies in how to specify the list of regions, since the following command works fine:
pysam.view(ops, bamfile, '1:2010000-20200000')
and returns alignments.
My question is: does pysam.view support multiple regions and how does one specify this list? I have searched for documentation regarding this but not found anything.
(Also note that the number indicating the end of each of your regions is ~10x larger than the beginning - it seems you might have intended
2010000-2020000
instead.)I have tested it using the following code:
However, this way of extracting alignments is not very efficient, as the output you get is one large
str
, instead of individual alignments. To get alist
of separate alignments instead, use the following code:Each element of
alignment
then ends up being apysam.AlignedSegment
object, with which you can work further using the functions in pysam API.