xmltodict: "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte"

I am trying to read an .xml.dump file with the xmltodict library in Python 3 and build a dictionary from it. My code looks like this:

import xmltodict

with open('file1.xml.dump') as fd:
    content = fd.read()
    doc = xmltodict.parse(content)

The error that I got is: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Does anyone know what this error means and how to fix it?

I also tried adding encoding='UTF-8' to the open() call, and I get the same error.

There is 1 answer

Guillermo Gutiérrez

I just ran into this error.

In my case it was caused by a float assigned to the #text node.

'field': {
    '@attribute': 'm3',
    '#text': 10.076
}

The assignment is valid, but it raises the encoding error.

The easiest fix is to turn the value into a string with an f-string, like this:

'field': {
    '@attribute': 'm3',
    '#text': f'{10.076}'
}

So I would suggest reviewing your dictionary and verifying that all of its field values are in fact strings.
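
If the dictionary is large, checking every value by hand gets tedious. Here is a minimal sketch of a helper that walks the structure and converts every non-string value to a string. The name stringify_leaves is my own and not part of xmltodict, and leaving None untouched is an assumption on my part (I am treating it as an empty element), so adjust that to suit your data.

```python
def stringify_leaves(node):
    """Return a copy of `node` with every non-string leaf converted to str."""
    if isinstance(node, dict):
        # Recurse into nested elements, keeping the keys as they are.
        return {key: stringify_leaves(value) for key, value in node.items()}
    if isinstance(node, (list, tuple)):
        # Repeated elements are stored as sequences; convert each item.
        return [stringify_leaves(item) for item in node]
    if node is None or isinstance(node, str):
        # Assumption: None stands for an empty element, so leave it alone.
        return node
    # Floats, ints, bools and anything else become their str() form.
    return str(node)


doc = {'field': {'@attribute': 'm3', '#text': 10.076}}
clean = stringify_leaves(doc)
print(clean)
```

The cleaned dictionary, wrapped in a single root key, can then be handed to xmltodict.unparse in place of the original.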