How to manipulate a multibyte string in Python?

Question

293 views · Asked by mathewFarrel on 23 July 2020 at 10:26

I have a log file that contains multibyte data. I want to write a script that does some data manipulation on it.

# fo holds the path to the log file
with open(fo, encoding="cp1252") as file:
    for line in file:
        print(line)
        if "WINDOWS" in line:
            print("found")

In the output of print(line), there is one extra byte after every character. The search does not work because the literal WINDOWS is not multibyte, so it never matches. I have been unable to find a solution for this. Can someone help me here?

There is 1 answer

**tripleee** · Accepted Answer · 2020-07-23T11:42:46+00:00

cp1252 is not a multibyte encoding. If the file in fact contains UTF-16, but most of it is in the very lowest range of Unicode, decoding it as cp1252 will yield roughly the correct characters, except that there will be zero (null) bytes between them. Without an unambiguous sample of the bytes in the file, we can only speculate; but try opening the file with encoding='utf-16le'. (If this fails, please edit your question to include a hex dump or repr() of the binary bytes in the file; see also Problematic questions about decoding errors.)
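To illustrate the suspected cause, here is a minimal sketch. It assumes the log really is UTF-16LE without a BOM, which the question does not confirm. The sample line and the temporary file are made up for the demonstration. It writes a UTF-16LE file, shows the raw bytes with repr(), and compares decoding as cp1252 with decoding as utf-16-le.

```python
import os
import tempfile

# Hypothetical sample line; the real log contents are not shown in the question.
sample = "OS: WINDOWS 10\n"

# Write the sample as UTF-16LE (no BOM) to a temporary file in binary mode.
with tempfile.NamedTemporaryFile(delete=False, suffix=".log") as tmp:
    tmp.write(sample.encode("utf-16-le"))
    path = tmp.name

try:
    # Inspect the raw bytes first; this is the kind of repr() dump the answer asks for.
    with open(path, "rb") as f:
        head = f.read(16)

    # Decoding as cp1252 maps every byte to one character, so each ASCII
    # letter is followed by a "\x00" character.
    with open(path, encoding="cp1252") as f:
        wrong = f.read()

    # Decoding as UTF-16LE combines each pair of bytes into one character.
    with open(path, encoding="utf-16-le") as f:
        right = f.read()
finally:
    os.remove(path)

found_wrong = "WINDOWS" in wrong   # search against the mis-decoded text
found_right = "WINDOWS" in right   # search against the correctly decoded text

print(repr(head))
print(found_wrong, found_right)
```

If the real file starts with a byte order mark (the bytes FF FE or FE FF), encoding="utf-16" may be the better choice, since that codec reads the BOM to pick the byte order and strips it from the text.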
