How should I go about reading a .docx file with Python and being able to recognize the italicized text and storing it as a string?
I looked at the docx python package but all I see is features for writing to a .docx file.
I appreciate the help in advance
Here's what my example document,
TestDocument.docx
, looks like.Note: The word "Italic" is in Italics, but "Emphasis" uses the style, Emphasis.
If you install the
python-docx
module. This is a fairly simple exercise.The
Run.italic
attribute captures whether the text is formatted as Italic, but it doesn't know if a text block has a Style that is rendered in Italic, but it can be detected by checking Run.style.name (if you know what styles in your document are rendered in Italics.