I have two questions that I would like to discuss with you.

Question 1: I aligned my paired-end FASTQ files to the reference genome and obtained BAM files with an equal number of reads for each mate (read 1 and read 2). Based on the bowtie2 summary below, which numbers should I take as the count of aligned read 1 and aligned read 2?
Time loading reference: 00:00:22
Time loading forward index: 00:03:17
Time loading mirror index: 00:01:46
Multiseed full-index search: 01:39:15
18049380 reads; of these:
18049380 (100.00%) were paired; of these:
2086895 (11.56%) aligned concordantly 0 times
12269949 (67.98%) aligned concordantly exactly 1 time
3692536 (20.46%) aligned concordantly >1 times
----
2086895 pairs aligned concordantly 0 times; of these:
39989 (1.92%) aligned discordantly 1 time
----
2046906 pairs aligned 0 times concordantly or discordantly; of these:
4093812 mates make up the pairs; of these:
3868173 (94.49%) aligned 0 times
143341 (3.50%) aligned exactly 1 time
82298 (2.01%) aligned >1 times
89.28% overall alignment rate
Time searching: 01:44:41
Overall time: 01:44:47
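
To make the summary easier to read, here is a minimal sketch of the arithmetic behind the "overall alignment rate" line, using only the counts printed above. The variable names are mine and purely illustrative. Note that the summary does not say how the singly aligned mates split between read 1 and read 2, so it only gives a lower bound per mate: every concordantly or discordantly aligned pair contributes one read 1 and one read 2.

```python
# Counts copied from the bowtie2 summary above (variable names are illustrative).
total_pairs = 18_049_380
concordant_once = 12_269_949
concordant_multi = 3_692_536
discordant_once = 39_989
mates_once = 143_341    # mates from unaligned pairs that aligned exactly 1 time
mates_multi = 82_298    # mates from unaligned pairs that aligned >1 times

# Pairs in which both mates aligned (concordantly or discordantly).
pairs_aligned = concordant_once + concordant_multi + discordant_once

# Mates that aligned on their own; the read 1 / read 2 split is not reported.
single_mates_aligned = mates_once + mates_multi

aligned_mates = 2 * pairs_aligned + single_mates_aligned
total_mates = 2 * total_pairs
overall_rate = 100 * aligned_mates / total_mates

print(pairs_aligned)          # 16002474 aligned read 1 and 16002474 aligned read 2
print(single_mates_aligned)   # 225639 additional mates, read 1 or read 2
print(f"{overall_rate:.2f}%")  # 89.28%
```

The result matches the 89.28% reported by bowtie2, so the rate is counted per mate, not per pair.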

Question 2: I removed duplicates and non-uniquely mapped reads, and afterwards removed reads in blacklisted regions and on the unplaced (chrUn) contigs from the BAM. Now read 1 and read 2 differ in number, as expected, because one mate of a pair can fall in a blacklisted region or on an unplaced contig while the other does not. Could this be a problem, or can I keep a BAM file with unequal numbers of read 1 and read 2 for the subsequent peak calling?
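
To make the situation concrete, here is a minimal sketch of how the per-mate counts and the orphaned mates could be tallied from SAM text, for example the output of `samtools view -h` on the filtered BAM. The function name `count_mates` and the toy records are hypothetical and only for illustration; they are not taken from the pipeline described above. The flag bits are the standard SAM ones (0x40 first in pair, 0x80 second in pair, 0x4 unmapped, 0x100 secondary, 0x800 supplementary).

```python
READ1 = 0x40                  # first mate in the pair
READ2 = 0x80                  # second mate in the pair
SKIP = 0x4 | 0x100 | 0x800    # unmapped, secondary, supplementary


def count_mates(sam_lines):
    """Count primary mapped read 1 / read 2 records and mates left without a partner."""
    read1_names, read2_names = set(), set()
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & SKIP:
            continue
        if flag & READ1:
            read1_names.add(name)
        elif flag & READ2:
            read2_names.add(name)
    return {
        "read1": len(read1_names),
        "read2": len(read2_names),
        "complete_pairs": len(read1_names & read2_names),
        "orphans": len(read1_names ^ read2_names),
    }


# Toy input: pairA is complete, pairB lost its read 2 during filtering.
toy_sam = [
    "@HD\tVN:1.6",
    "pairA\t99\tchr1\t100\t42\t50M\t=\t300\t250\t*\t*",
    "pairA\t147\tchr1\t300\t42\t50M\t=\t100\t-250\t*\t*",
    "pairB\t99\tchr1\t500\t42\t50M\t=\t700\t250\t*\t*",
]
counts = count_mates(toy_sam)
print(counts)
# {'read1': 2, 'read2': 1, 'complete_pairs': 1, 'orphans': 1}
```

In this toy case the "orphans" entry is the kind of read I am asking about: a mate whose partner was removed by the blacklist or contig filter.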