Pull Alignment Character Position


I use pairwise align to get the following:

> alignment <- pairwiseAlignment(pattern = canonical.protein, subject = protein.extracted)
> alignment
Global PairwiseAlignedFixedSubject (1 of 1)
pattern: [448]          DDWEIPDGQITVGQRIGSGSFGTVYKGKWHGDVAVKMLNVTAPTPQQLQAFKNEVGV...FMVGRGYLSPDLSKVRSNCPKAMKRLMAE  CLKKKRDERPLFPQILASIELLARSLPK 
subject:   [1]     DDWEIPDGQITVGQRIGSGSFGTVYKGKWHGDVAVKMLNVTAPTPQQLQAFKNEVGV...FMVGRGYLSPDLSKVRSNCPKAMKRLMAECLKKKRDERPLFPQILASIELLARSLPK 
score: -912.3752 

I can then use:

toString(pattern(alignment))
toString(subject(alignment)) 

to get the full sequence string for both the pattern and the subject. However, how do I get the numbers 448 and 1 (the values in square brackets) out of the object as integers? I need to use these numbers, but there doesn't seem to be a way to get at them.


There are 2 answers

Martin Morgan (best answer)

I believe these are the starts of the alignments, so

start(pattern(alignment))

Your question would be clearer with a fully reproducible example, e.g.,

library(Biostrings)
example(pairwiseAlignment)
aln <- pairwiseAlignment(AAString("PAWHEAE"), AAString("HEAGAWGHEE"),
    substitutionMatrix = "BLOSUM50", gapOpening = 0, gapExtension = -8)

Then

> aln
Global PairwiseAlignedFixedSubject (1 of 1)
pattern: [1] PA--W-HEAE
subject: [2] EAGAWGHE-E
score: 1
> start(subject(aln))
[1] 2
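
As a minimal sketch of how this answers the original question, the same accessor can be applied to both halves of the alignment and the results stored for later use. This assumes a Biostrings version in which pairwiseAlignment() is available exactly as called above; the variable names pattern.start and subject.start are made up for illustration.

```r
library(Biostrings)

# Same toy alignment as in the example above
aln <- pairwiseAlignment(AAString("PAWHEAE"), AAString("HEAGAWGHEE"),
    substitutionMatrix = "BLOSUM50", gapOpening = 0, gapExtension = -8)

# The bracketed number on the "pattern:" line of the printed alignment
pattern.start <- start(pattern(aln))

# The bracketed number on the "subject:" line of the printed alignment
subject.start <- start(subject(aln))

# Both are ordinary vectors and can be used in arithmetic or indexing
pattern.start
subject.start
```

In the question's case, the same two calls on alignment would be expected to return the 448 and 1 shown in the printed output.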

Also, the Bioconductor mailing list is more appropriate for these questions; no subscription required.

Niek de Klein

Since you can make a string out of the alignment, you can use R's string functions. For example, substr(toString(pattern(alignment)), 448, 448) returns the 448th character of that string. I'm not familiar with that library, so there may be a built-in way that I don't know of. See http://www.statmethods.net/management/functions.html for string functions in R.
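
For reference, a small base R sketch of how substr() behaves on a plain character string. The sequence is borrowed from the toy example in the other answer, and the variable name s is only illustrative. Note that substr() counts characters in whatever string it is given, so it returns a character, not the bracketed position printed with the alignment.

```r
# A plain character string standing in for a sequence
s <- "PAWHEAE"

# Extract a single character by giving the same start and stop position
third <- substr(s, 3, 3)   # "W"

# Total number of characters in the string
len <- nchar(s)            # 7
```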