I have a DNA sequence, its length for example is m*4n:
B = 'GATTAACTACACTTGAGGCT...';
I have also a vector of real numbers X = {xi, i = 1..m*4n}, and use mod(X,1)
to keep them in the range [0,1]. For example:
X = [0.223 0.33 0.71 0.44 0.91 0.32 0.11 ....... m*4n];
I then need to transform X
into a binary vector by applying the function:
f(x)={0 ,0 < X(i,j) ≤ 0.5; 1 ,0.5 < X(i,j) ≤ 1;)
The output according the previous values will be like X = [0010100 ....]
. If X(i,j)==1
, then B(i,j)
is complemented, otherwise it is unchanged. In this context, the complement is the matching base-pair (i.e. A->T, C->G, G->C, and T->A).
This is the code I've tried so far, with no success:
%%maping X chaotic sequence from real numbers to binary sequence using threshold function
X = v(:,3);
X(257)=[];
disp (X);
mode (X,1);
for i=1
for j=1:256
if ((X(i,j)> 0) && (X(i,j)<= .5))
X(i,j) = 0;
elseif ((X(i,j)> .5) && (X(i,j)<= 1))
X(i,j) = 1;
end
end
end
disp(X);
How can I properly perform the indexing and complement?
Given a sample base-pair sequence stored as a character array:
And a sample vector of numeric values the same length as
B
:Then there is a fairly straightforward solution...
First, your use of the
mod
function implies you want to use only the fractional part of each value inX
. This is how you wold do that:Next, you should give the documentation on vectorization a read. It will teach you that a for loop can be avoided for many operations in MATLAB. In particular, applying a logical test to your vector
X
can be done like so:And
index
is now a logical index the same length asX
with ones (i.e. true) for each value greater than 0.5. You now want to get the characters corresponding to those indices inB
, change them to their complement, then place them back inB
. You can do this using a little trick in MATLAB whereby a character is converted to its ASCII numeric value when used an as index:Notice the characters
'TGCA'
get placed in indices 65, 67, 71, and 84 ofcompMap
(i.e. the ASCII values for'ACGT'
). The rest are blanks. Now, you can replace the indexed base-pairs with their complements by simply doing this:Putting this all together, here's the solution: