I'm trying to encrypt a string in C#:
    static public string Encrypt(char[] a)
    {
        for (int i = 0; i < a.Length; i++)
        {
            a[i] -= (char)(i + 1);
            if (a[i] < '!')
            {
                a[i] += (char)(i + 20);
            }
        }
        return new string(a);
    }
Now, when I put in this string:
"Qui habite dans un ananas sous la mer?".
The encryption comes out as:
`Psf3c[[ak[3XT`d3d\3MYKWIZ3XSXU3L@?JAMR`
There's an unrecognizable character in there, after the @. I don't know how it got there, and I don't know why.
If I try to decrypt it using this method:
    static public string Decrypt(char[] a)
    {
        for (int i = 0; i < a.Length; i++)
        {
            a[i] += (char)(i + 1);
            if ((a[i] - 20) - i <= '!')
            {
                a[i] -= (char)(i + 20);
            }
        }
        return new string(a);
    }
This is the (incorrect) output:
Qui habite dans un ananas sous laamerx.
How do I allow the encryption routine to handle Unicode characters?
Generally, modern encryption doesn't pay attention to characters (there may not even be any; we might be encrypting a picture or a sound file). It pays attention to bytes.
You could take the same approach: get the bytes of the text in a particular encoding (UTF-8 would be a good one) and then do your encryption on those bytes.
The encrypted bytes are then your output. If you need something that can be written down, you could use Base64 to produce a textual representation.
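As a rough sketch of that idea, the following reworks the position-based shift from the question so that it operates on UTF-8 bytes and returns Base64 text. The class name `ByteCipher` and the decision to drop the `'!'` adjustment are my own assumptions for illustration, not part of the original code; the arithmetic simply wraps modulo 256, so every byte value is valid and reversible.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; this is a toy cipher, not secure.
static class ByteCipher
{
    public static string Encrypt(string plainText)
    {
        // Work on the UTF-8 bytes rather than on UTF-16 chars.
        byte[] data = Encoding.UTF8.GetBytes(plainText);
        for (int i = 0; i < data.Length; i++)
        {
            // Subtract (i + 1) and wrap modulo 256; any byte value is acceptable.
            data[i] = unchecked((byte)(data[i] - (i + 1)));
        }
        // Base64 turns the arbitrary bytes into printable ASCII text.
        return Convert.ToBase64String(data);
    }

    public static string Decrypt(string cipherText)
    {
        byte[] data = Convert.FromBase64String(cipherText);
        for (int i = 0; i < data.Length; i++)
        {
            // Undo the shift; the wrap-around reverses exactly.
            data[i] = unchecked((byte)(data[i] + (i + 1)));
        }
        return Encoding.UTF8.GetString(data);
    }
}
```

With this sketch, `ByteCipher.Decrypt(ByteCipher.Encrypt("Qui habite dans un ananas sous la mer?"))` gives back the original sentence, and the same holds for text containing non-ASCII characters, because the intermediate value is never interpreted as text until it has been decrypted.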
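The encryption still won't be very good, because designing a secure cipher is the hard part, and for real uses we'd pick an established and well-tested encryption scheme. But you'll have a viable approach that won't produce illegal Unicode sequences like noncharacters or mismatched surrogates. That is what happened in your output: the space at index 33 has code 32, and subtracting 34 from it wraps the 16-bit char around to U+FFFE, which is a noncharacter.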