Python 2 vs Python 3 - Encoding

Question

46 views · Asked by coincoin22 on 28 September 2023 at 18:22

I have a simple piece of code:

# -*- coding: utf-8 -*-
text = "12É45678"
print(len(text))

Note the uppercase E with an acute accent (É).

When I run this with Python 2, the result is 9; when I run it with Python 3, the result is 8.

How can I obtain 8 in Python 2 (natively)?

There is 1 answer

**Brian61354270** · Answer 1 · 2023-09-28T18:28:23+00:00

In Python 2, str is a naive sequence of bytes (what Python 3 calls bytes). To interpret those bytes as Unicode code points, you need to decode them into a unicode object:

# -*- coding: utf-8 -*-
text = "12É45678"
print(len(text))
print(len(text.decode("utf-8")))

In Python 2, this prints

9
8
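To make the 9 versus 8 difference concrete, here is a minimal sketch that spells out the bytes explicitly instead of relying on the source file's encoding. It assumes the file was saved as UTF-8, where É is stored as the two bytes C3 89; the names raw and decoded are illustrative only and do not come from the answer.

```python
# -*- coding: utf-8 -*-
# Sketch only: runs on both Python 2 and Python 3.
# A bytes literal is used so the byte content is explicit.
raw = b"12\xc3\x8945678"       # UTF-8 bytes of "12É45678"; É occupies two bytes (C3 89)
decoded = raw.decode("utf-8")  # interpret the bytes as Unicode code points

print(len(raw))      # counts bytes
print(len(decoded))  # counts code points
```

The byte string has one more element than the decoded text because É needs two bytes in UTF-8 but is a single code point.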

See also the Unicode HOWTO from the Python 2 documentation.
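If changing the literal itself is acceptable, another option (not part of the answer above, so treat it as an assumption about what fits your code) is to make the string a unicode object from the start. A minimal sketch using the standard `from __future__ import unicode_literals` import, which is available from Python 2.6 onward:

```python
# -*- coding: utf-8 -*-
# Sketch only: with this import, plain string literals are unicode objects
# in Python 2 as well, so no explicit decode() is needed.
from __future__ import unicode_literals

text = "12É45678"             # unicode in Python 2, str in Python 3
encoded = text.encode("utf-8")  # the byte form a plain Python 2 str would have held

print(len(text))     # counts code points
print(len(encoded))  # counts bytes
```

Writing the literal as u"12É45678" has the same effect for a single string. Either way, the coding declaration at the top must match the encoding the file is actually saved in.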
