Print the text of an .rtf document whose content consists of blocks, as shown in the photo


I have an RTF document and need to output all of the text in it. The text is divided into blocks, though, and the file does not seem to contain the actual characters. When I read it with the standard Python libraries, the characters come out in the wrong encoding, and when I set the encoding to UTF-8, the program crashes.

[image: example of a block]

def extract_text_from_rtf_file(rtf_file_path):
    with open(rtf_file_path, 'r', encoding='latin-1') as file:
        rtf_content = file.read()
        plain_text = rtf_content.replace('\\', '').replace('{', '').replace('}', '')
        return plain_text

rtf_file_path = 'D:/xxxxxxxxxxxxxxxxxxxxxx.rtf'
rtf_text = extract_text_from_rtf_file(rtf_file_path)
print(rtf_text)
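
For comparison, here is a minimal sketch of what decoding the escapes (rather than deleting the backslashes) could look like. It assumes the "blocks" are RTF character escapes: `\'hh` (one byte in the document's code page) and `\uN` (a Unicode code unit followed by one fallback character, the RTF default). The function name `rtf_to_text`, the `codepage` parameter, and its `cp1252` default are illustrative assumptions, not taken from the original file; the real code page would normally be read from the `\ansicpgN` control word in the header. The sketch does not skip header groups such as the font table, so their text can leak into the output.

```python
import re

# One token per match: \'hh byte, \uN escape (plus one fallback character),
# escaped literal, control word, any other control symbol, or a brace / raw line break.
_TOKEN = re.compile(
    r"\\'([0-9a-fA-F]{2})"                          # group 1: hex byte
    r"|\\u(-?\d+) ?(?:\\'[0-9a-fA-F]{2}|[^\\{}])?"  # group 2: Unicode code unit
    r"|\\([\\{}])"                                  # group 3: escaped \ { }
    r"|\\([a-zA-Z]+)(?:-?\d+)? ?"                   # group 4: control word name
    r"|\\."                                         # other control symbols
    r"|[{}\r\n]",                                   # group braces, raw line breaks
    re.DOTALL,
)


def rtf_to_text(rtf, codepage="cp1252"):
    """Decode RTF character escapes and drop control words (sketch only).

    `codepage` is an assumption; pass the one named by \\ansicpgN in the file.
    Only code units in the Basic Multilingual Plane are handled.
    """
    out = []
    pending = bytearray()  # consecutive \'hh bytes, decoded together

    def flush():
        if pending:
            out.append(pending.decode(codepage, errors="replace"))
            pending.clear()

    pos = 0
    for m in _TOKEN.finditer(rtf):
        literal = rtf[pos:m.start()]
        if literal:
            flush()
            out.append(literal)
        pos = m.end()

        if m.group(1) is not None:
            pending.append(int(m.group(1), 16))
            continue

        flush()
        if m.group(2) is not None:
            code = int(m.group(2))
            # \uN is a signed 16-bit value; negative numbers wrap around.
            out.append(chr(code + 65536 if code < 0 else code))
        elif m.group(3) is not None:
            out.append(m.group(3))
        elif m.group(4) in ("par", "line"):
            out.append("\n")
        elif m.group(4) == "tab":
            out.append("\t")
        # Everything else (other control words, braces) is dropped.

    flush()
    out.append(rtf[pos:])
    return "".join(out)


# Hypothetical sample, assuming a Cyrillic (cp1251) document.
sample = r"{\rtf1\ansi \'cf\'f0\'e8\'e2\'e5\'f2\par \u1084?\u1080?\u1088? \{ok\}}"
print(rtf_to_text(sample, codepage="cp1251"))
```

Reading the file with `encoding='latin-1'`, as in the code above, should be fine for this approach, since RTF files are normally plain 7-bit ASCII and the non-ASCII characters live in the escapes. For real documents, a dedicated parser such as the third-party `striprtf` package is likely a safer choice than a regex sketch like this one.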