I'm new to Python and have only started using it for a CTF challenge (I'm a cybersecurity student). I was given a mostly pre-built "decoder" script, and the assignment was to complete it. I have my word list imported into the script as a list, and a while loop runs through each word. It all works, except that as soon as a word containing a non-ASCII character comes up in the word list (in this case "Français"), I get this error:
    Traceback (most recent call last):
      File "dict.py", line 32, in <module>
        cipher = AES.new(secret + (BLOCK_SIZE - len(codecs.encode(secret, 'utf-8' )) % BLOCK_SIZE) * PADDING, AES.MODE_ECB)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
That would make sense if the code didn't specify UTF-8 encoding, but it does. Also, when I run echo $LANG in the terminal I get en_GB.UTF-8, not ASCII!
For reference, here's the original code as I received it:
    # pip install pycryptodome
    from Crypto.Cipher import AES
    import base64

    BLOCK_SIZE = 32
    PADDING = '{'

    # Encrypted text to decrypt
    encrypted = "9l21XiUohS1j9kUx02KJNAkjw51pupGgiMlCkuVNEMo="

    def decode_aes(c, e):
        return c.decrypt(base64.b64decode(e)).decode('latin-1').rstrip(PADDING)

    secret = "password"

    if secret[-1:] == "\n":
        print("Error, new line character at the end of the string. This will not match!")
    elif len(secret.encode('utf-8')) >= 32:
        print("Error, string too long. Must be less than 32 bytes.")
    else:
        # create a cipher object using the secret
        cipher = AES.new(secret + (BLOCK_SIZE - len(secret.encode('utf-8')) % BLOCK_SIZE) * PADDING, AES.MODE_ECB)
        # decode the encoded string
        decoded = decode_aes(cipher, encrypted)
        if decoded.startswith('FLAG:'):
            print("\n")
            print("Success: "+secret+"\n")
            print(decoded+"\n")
        else:
            print('Wrong password')
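To make the padding arithmetic in the AES.new line concrete, here is a small sketch of what I understand it to compute. The helper name pad_secret is my own (it is not part of the script I was given), and I'm assuming the secret should be measured and padded as UTF-8 bytes:

```python
BLOCK_SIZE = 32
PADDING = b'{'

def pad_secret(secret):
    """Hypothetical helper: encode the secret as UTF-8, then right-pad it
    with '{' so the result is a multiple of BLOCK_SIZE bytes long."""
    raw = secret.encode('utf-8')
    pad_len = BLOCK_SIZE - len(raw) % BLOCK_SIZE
    return raw + pad_len * PADDING

for word in (u"password", u"Fran\u00e7ais"):
    key = pad_secret(word)
    # Report the UTF-8 length of the word and the length of the padded key
    print("%d bytes -> %d-byte key" % (len(word.encode('utf-8')), len(key)))
```

"password" is 8 bytes and gets 24 padding characters; "Français" is 9 bytes in UTF-8 (the ç takes two) and gets 23, so both end up as 32-byte keys.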
Here it is after I've modified it (unchanged portion omitted):
    with open('words.txt', 'r') as f:
        words = f.readlines()

    i = 0
    while i < len(words):
        secret = words[i][0:-1]
        print(secret)
        if secret[-1:] == "\n":
            print("Error, new line character at the end of the string. This will not match!")
        elif len(secret.encode('utf-8')) >= 32:
            print("Error, string too long. Must be less than 32 bytes.")
        else:
            # create a cipher object using the secret
            cipher = AES.new(secret + (BLOCK_SIZE - len(secret.encode('utf-8')) % BLOCK_SIZE) * PADDING, AES.MODE_ECB)
            # decode the encoded string
            decoded = decode_aes(cipher, encrypted)
            if decoded.startswith('FLAG:'):
                print("\n")
                print("Success: "+secret+"\n")
                print(decoded+"\n")
                break
            else:
                print('Wrong password')
        print(i)
        i += 1
This works up until it reaches "Français".
I'm running it in Python 2.7. I've tried running it in Python 3, but then the import of Crypto.Cipher fails (ModuleNotFoundError: No module named 'Crypto'), and I don't know enough about Python to fix that. I have also tried installing the codecs module and replacing secret.encode('utf-8') with codecs.encode(secret, 'utf-8'), but I get the same error.
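To narrow it down, here is a minimal sketch that reproduces the same message without AES or the word list. It rests on an assumption I haven't confirmed: that in Python 2, calling .encode('utf-8') on a byte string first decodes it implicitly with the ASCII codec. The snippet performs that decode explicitly on the UTF-8 bytes of "Français" (the variable names are my own), so it should behave the same way under Python 2 and Python 3:

```python
# UTF-8 bytes of "Français", as they would be read from words.txt
raw = b"Fran\xc3\xa7ais"

# Explicit ASCII decode: my guess at the implicit step Python 2 performs
# when .encode('utf-8') is called on a byte string (an assumption)
try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding as UTF-8 instead succeeds and round-trips to the same bytes
text = raw.decode("utf-8")
roundtrip = text.encode("utf-8")
print(roundtrip == raw)
```

The message matches the one in my traceback, byte 0xc3 at position 4 included, which suggests the failure may happen in a decode step before the utf-8 encode is ever applied.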
I don't need anyone to solve the rest of this assignment for me. If there are other reasons my code won't work, don't tell me, I'll figure it out. But this encoding thing really doesn't seem like it's meant to be part of the challenge, and I'm stumped as to what to do next. Why is it using ASCII as its default encoding despite arguments to the contrary, and how do I fix it?