Unable to read an xls (a password protected file) after decrypting it

decrypted_workbook = io.BytesIO()
with open(file_path, 'rb') as file:
    office_file = msoffcrypto.OfficeFile(file)
    office_file.load_key(password=password)
    office_file.decrypt(decrypted_workbook)

df = pd.read_excel(decrypted_workbook)

I have a password-protected xls file, which I decrypted using msoffcrypto.OfficeFile(). When I try to read the decrypted file, I get the following error:

UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 52-53: illegal UTF-16 surrogate

I expected to be able to read the decrypted file as a workbook. I tried the following:

df = pd.read_excel(decrypted_workbook)
df = pd.read_excel(decrypted_workbook,encoding = "utf-8") 

I also tried

df = pd.read_excel(decrypted_workbook, encoding = "utf-16")

None of them worked.


There is 1 answer

Mess On

Try it like this:

import msoffcrypto
import io
import pandas as pd

decrypted = io.BytesIO()

with open("test_encrypted.xlsx", "rb") as f:  # replace with the actual file path
    file = msoffcrypto.OfficeFile(f)
    try:
        # For some formats (e.g. legacy .xls) the password may already be
        # checked here, so load_key() is kept inside the try block as well
        file.load_key(password="test")  # use the real password
        file.decrypt(decrypted)
        # Create an ExcelFile object from the decrypted in-memory buffer
        xls_file = pd.ExcelFile(decrypted)

        # Print sheet names to verify
        print(xls_file.sheet_names)

    except msoffcrypto.exceptions.InvalidKeyError:
        print('The password is incorrect.')

If it still does not work, the file may be corrupted.
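
One way to follow up on the possibility of corruption is to look at the first bytes of the decrypted buffer before handing it to pandas. The sketch below is only a sanity check, not a full validation: a legacy .xls file is an OLE2 compound file and an .xlsx file is a ZIP archive, and each starts with a well-known signature. The helper name sniff_container and the sample buffers are hypothetical and stand in for the decrypted BytesIO object from the code above.

```python
import io

# Signatures found at the start of the two container formats Excel uses.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls (OLE2 compound file)
ZIP_MAGIC = b"PK\x03\x04"                        # .xlsx / .xlsm (ZIP archive)


def sniff_container(buffer: io.BytesIO) -> str:
    """Return 'ole2', 'zip', or 'unknown' based on the first bytes of buffer."""
    buffer.seek(0)
    head = buffer.read(8)
    buffer.seek(0)  # rewind so the buffer can be passed to a reader afterwards
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# Hypothetical stand-ins for a decrypted buffer
print(sniff_container(io.BytesIO(OLE2_MAGIC + b"\x00" * 16)))  # ole2
print(sniff_container(io.BytesIO(b"PK\x03\x04rest")))          # zip
print(sniff_container(io.BytesIO(b"")))                        # unknown
```

If the result is 'ole2', passing engine="xlrd" to pd.read_excel or pd.ExcelFile selects the legacy .xls reader explicitly; for 'zip', engine="openpyxl" does the same for .xlsx. An 'unknown' result suggests the decrypted output is not a valid workbook at all, which would point to a wrong password or a damaged file rather than an encoding problem.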