IDNA Encode Adding Apostrophes and letter B?


I am using the IDNA library to encode and decode Unicode domain names, but when I encode a domain name, the output has apostrophes on either side of the string and the letter b in front. Why?

For example:

import idna
print(idna.encode('español.com'))

Output: b'xn--espaol-zwa.com'

Expected output: xn--espaol-zwa.com

I feel like I'm missing something really obvious but not sure how to get to the bottom of this.

My expected output is reinforced by the fact that if I decode it:

print(idna.decode('xn--espaol-zwa.com'))

I get the original domain: español.com

There is 1 answer

Mr Fett (best answer)

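For any newbies like me looking for a simple solution to this: as @Barmer has pointed out, idna.encode returns a byte string (a Python bytes object) even if you feed in a character string. The b prefix and the apostrophes are not part of the data; they are just how Python displays a bytes object when you print it.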

If you want a string, you can chain UTF-8 decoding thus:

idna.encode('español.com').decode('utf-8')

This outputs the character string: xn--espaol-zwa.com

idna.decode will correctly decode this back to español.com without any further treatment needed.
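
To see the bytes versus str difference in isolation, here is a minimal sketch. As an assumption for illustration, it uses the standard library's built-in "idna" codec as a stand-in for idna.encode, so it runs without installing anything. That codec follows the older IDNA 2003 rules, so it is not a drop-in replacement for the idna package, but for this domain it should produce the same bytes.

```python
# Stand-in for idna.encode(): Python's built-in "idna" codec (an assumption
# for this sketch; the idna package from the question is a separate library).
encoded = "español.com".encode("idna")

# The result is a bytes object, and print() shows its repr, b'...'.
print(type(encoded).__name__)
print(encoded)

# Decoding the bytes gives an ordinary str, which prints without the b'...'.
# The encoded form is pure ASCII, so "ascii" and "utf-8" both work here.
text = encoded.decode("ascii")
print(text)

# Round trip back to the Unicode form of the domain.
original = encoded.decode("idna")
print(original)
```

The same .decode('ascii') or .decode('utf-8') call applies to the bytes returned by idna.encode, as in the one-liner above.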