How to stop printing the properties of an .rtf file when I use print(file.read()) in Python

I am new to Python and have trouble printing the contents of a file (I have only tried .rtf): the output includes all of the file's properties, not just the text. I've tried several ways of writing the same thing, but the output is always similar. Here is an example of the code and its output:

opener=open("file.rtf","r")
print(opener.read())
opener.close()
  • The file only contains this:

Camila

Employee

Try it

  • But the outcome is always:
{\rtf1\ansi\ansicpg1252\cocoartf1671\cocoasubrtf600
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
{\*\expandedcolortbl;;}
\margl1440\margr1440\vieww10800\viewh8400\viewkind0
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0

\f0\fs24 \cf0 Camila\
\
Employees\
\
Try it}

How can I stop that from happening, or what am I doing wrong?

There are 2 answers

sommervold (BEST ANSWER)

An RTF file contains more than just the text: it also stores formatting information such as fonts and colors. Python's open() reads the file as plain text, so read() returns that markup along with your content. To get only the text, you need a module that can translate RTF, such as striprtf.

Make sure the module is installed by running this on the command line:

pip install striprtf

Then, to get your text:

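from striprtf.striprtf import rtf_to_text

# Read the raw RTF markup, then convert it to plain text.
# The with block closes the file automatically.
with open("file.rtf", "r") as file:
    plaintext = rtf_to_text(file.read())
print(plaintext)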
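
To see why a plain read() shows the markup, here is a small standard-library sketch. The sample string and the temporary file are made up for illustration (a shortened version of the markup in the question). It only shows that open() hands back the file's characters unchanged, including the RTF control words.

```python
import os
import tempfile

# Hypothetical sample: a shortened version of the markup from the question.
SAMPLE_RTF = r"{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\fs24 Camila}"

# Write the sample to a temporary .rtf file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".rtf", delete=False) as tmp:
    tmp.write(SAMPLE_RTF)
    path = tmp.name

# open() does not interpret RTF; read() returns the characters as stored.
with open(path, "r") as f:
    raw = f.read()
os.remove(path)

print(raw == SAMPLE_RTF)           # the markup comes back untouched
print(raw.startswith("{\\rtf1"))   # the text still begins with the RTF header
```

So nothing is wrong with the reading code itself. The extra output is simply what is stored in the file.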
coderboi

Use the striprtf package: https://github.com/joshy/striprtf

from striprtf.striprtf import rtf_to_text
rtf = "some rtf encoded string"
text = rtf_to_text(rtf)
print(text)
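
For a rough idea of what such a conversion involves, here is a hand-rolled sketch using only the standard library. It is not striprtf's implementation, and the names naive_rtf_to_text, TOKEN, and SKIPPED_DESTINATIONS are invented for this example. It assumes a simple document like the one in the question and does not handle hex escapes (\'hh), Unicode escapes, or other features a real library covers, so prefer striprtf for actual files.

```python
import re

# One token per match: control word, control symbol, brace, or plain text.
TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?|\\(.)|([{}])|([^\\{}]+)", re.DOTALL
)
# Groups that start with these control words hold metadata, not body text.
SKIPPED_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info"}


def naive_rtf_to_text(rtf):
    out = []
    stack = []        # "skipping" flag saved for each enclosing group
    skipping = False  # True while inside a metadata group
    for match in TOKEN.finditer(rtf):
        word, _param, symbol, brace, text = match.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif skipping:
            continue
        elif word is not None:
            if word in SKIPPED_DESTINATIONS:
                skipping = True
            elif word in ("par", "line"):
                out.append("\n")
            elif word == "tab":
                out.append("\t")
        elif symbol is not None:
            if symbol == "*":           # \* marks an optional destination
                skipping = True
            elif symbol in "\n\r":      # backslash + newline means a paragraph break
                out.append("\n")
            elif symbol in "\\{}":      # escaped literal character
                out.append(symbol)
        elif text is not None:
            # Raw newlines in the RTF source are not part of the text.
            out.append(text.replace("\n", "").replace("\r", ""))
    return "".join(out)


# Shortened version of the markup shown in the question.
SAMPLE = r"""{\rtf1\ansi\ansicpg1252
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
{\*\expandedcolortbl;;}
\pard\pardirnatural

\f0\fs24 \cf0 Camila\
\
Employees\
\
Try it}"""

print(naive_rtf_to_text(SAMPLE))
```

The sketch drops the font and color tables, discards formatting control words, and turns each backslash-newline pair into a line break, leaving only the three lines of text.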