Many years ago, I made a program in C# on Windows which "encrypts" text files using (what I thought was) a Caesar cipher.

Back then I wanted more characters than just A-Z and 0-9, and made that possible, but I never thought about the actual theory behind it.

Looking at some of the files and comparing them to this website, it seems like the UTF-8 values are being shifted.

I started up a Windows VM (because I'm using Linux now) and typed this: abcdefghijklmnopqrstuvwxyz

It generated text that looks like this in hexadecimal (shifted by 15):

70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f c280 c281 c282 c283 c284 c285 c286 c287 c288 c289

How can I shift the hexadecimal values so they look like this?

61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a

Or are there any easier/better methods of doing this?


I'm using Python 3.5.3, and this is the code I have so far:

import sys

arguments = sys.argv[1:]
file = ""

for arg in arguments:
    if arg[0] != "-":
        file = arg

lines = []
with open(file) as f:
    lines = f.readlines()

for line in lines:
    result = 0
    for value in list(line):
        #value = "0x"+value
        #result = result + temp
        pass  # conversion not implemented yet
    print (result)

Unfortunately, I don't have the C# source code available at the moment. I can try to find it.

3 Answers

wovano On Best Solutions

Assuming your input is ASCII text, the simplest solution is to encode/decode as ASCII and use the built-in functions ord() and chr() to convert from a character to its integer value and vice versa.

Note that the temp value cannot be less than 0, so the second if-statement can be removed.

NB: This is outside the scope of the question, but I also noticed that you're doing argument parsing yourself. I highly recommend using argparse instead, since it's very easy and gives you a lot extra for free (e.g. it performs error checking and prints a nice help message if you start your application with the '--help' option). See the example code below:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(dest='filenames', metavar='FILE', type=str, nargs='+',
                    help='file(s) to encrypt')
args = parser.parse_args()

for filename in args.filenames:
    with open(filename, 'rt', encoding='ascii') as file:
        lines = file.readlines()
    for line in lines:
        result = ""
        for value in line:
            temp = ord(value)  # character to int value
            temp += 15
            if temp > 0x7a:
                temp -= 0x7a
            result += chr(temp)  # int value to character
        print(result)
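
The encrypted sample in the question contains bytes above 0x7f (c280 is the UTF-8 encoding of U+0080), so a variant that treats the data as UTF-8 and shifts code points back may be closer to what is needed. This is a minimal sketch, assuming the original program simply added 15 to each character's code point; `shift_text` is a hypothetical helper name, not something from the original program.

```python
def shift_text(text, shift):
    """Return text with every character's code point moved by shift."""
    # ord() gives the code point, chr() turns the shifted value back into a character.
    return "".join(chr(ord(ch) + shift) for ch in text)


# The hex dump from the question, decoded from UTF-8 into a string.
encrypted = bytes.fromhex(
    "707172737475767778797a7b7c7d7e7f"
    "c280c281c282c283c284c285c286c287c288c289"
).decode("utf-8")

print(shift_text(encrypted, -15))  # abcdefghijklmnopqrstuvwxyz
```

Note that chr() raises ValueError if a shift pushes a code point below 0, so real input may need a range check.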

Ali Nuri Şeker On

You can use int('somestring'.encode('utf-8').hex(),16) to get the exact values shown on that website. If you want to apply the same rule to each character, you can do it over a list of characters. You can use:

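import codecs

def myencode(character, diff):
    # UTF-8 bytes of one character, read as a single integer and shifted
    shifted = int(character.encode('utf-8').hex(), 16) + diff
    digits = '%x' % shifted
    digits = '0' * (len(digits) % 2) + digits  # the hex codec needs an even length
    return codecs.decode(digits, 'hex').decode('utf-8')  # raises if not valid UTF-8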

diff should be the shift for the cipher (an integer). encode('utf-8') converts the string to a bytes object and .hex() displays those bytes as hex. You should feed this function only one character of a string at a time, so there are no issues shifting everything.

After you are done with the shifting, you need to decode the value into a new character. You can do this with the codecs library, converting from the integer to bytes and then back to a string with decode("utf-8").
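
To make those values concrete, the sketch below prints the code point next to the UTF-8 integer for two characters that appear in the question's hex dump. The helper name `utf8_value` is made up for illustration.

```python
def utf8_value(character):
    """Integer formed by reading the character's UTF-8 bytes as one number."""
    return int(character.encode("utf-8").hex(), 16)


for ch in ("p", "\u0080"):
    # Code point (what ord() returns) versus the UTF-8 bytes read as one number.
    print(hex(ord(ch)), hex(utf8_value(ch)))
# 0x70 0x70
# 0x80 0xc280
```

The two numbers agree only for single-byte (ASCII) characters, so arithmetic on the UTF-8 integer stops matching a code point shift once a character needs two bytes.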

Edit: Updated, now it works.

GiraffeMan91 On

You can convert back and forth between integers and hex strings using int() and hex(). However, hex() only works on integers, so first you need to convert the hex string to an integer using base 16.

hex_int = int(hex_str, 16)
cipher = hex_int - 15
hex_cipher = hex(cipher)

Now apply that in a loop and you can shift your results left or right as desired. And you could of course condense the code as well.

result = hex(int(hex_str, 16) - 15)

# in a loop
hexes = ['70', '71', 'c280']
ciphered = []
for n in hexes:
    ciphered.append(hex(int(n, 16) - 15))
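
One caveat with the loop above: 'c280' is a two-byte UTF-8 sequence, so subtracting 15 from it as a plain integer yields 0xc271 rather than the 0x71 the question expects. A possible variant, assuming each list entry is the UTF-8 hex of exactly one character, decodes the entry first and shifts the code point instead; `shift_hex` is a hypothetical name.

```python
def shift_hex(hex_str, shift):
    """Shift one character given as UTF-8 hex and return the result as UTF-8 hex."""
    char = bytes.fromhex(hex_str).decode("utf-8")  # 'c280' -> '\x80'
    shifted = chr(ord(char) + shift)               # move the code point
    return shifted.encode("utf-8").hex()           # back to a hex string


hexes = ['70', '71', 'c280']
ciphered = [shift_hex(n, -15) for n in hexes]
print(ciphered)  # ['61', '62', '71']
```

The same function works in the other direction: shift_hex('71', 15) gives 'c280', matching the encrypted sample.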