I'm trying to understand the RTF 1.9.1 specification, but #PCDATA (text without control words) is confusing me. Below is a sample that shows what I don't understand. Note that it is not laid out the way it would appear in a real file; I reformatted it to make it easier to read.
{
\fonttbl
{
\f0
\fbidi
\froman
\fcharset0
\fprq2
{
\*
\panose
02020603050405020304
}
Times New Roman;
}
}
The specification says:
If the character is anything other than an opening brace ({), closing brace (}), backslash (\), or a CRLF (carriage return/line feed), the reader assumes that the character is plain text and writes the character to the current destination using the current formatting properties.
If I followed the specification above, I would end up writing Times New Roman to the document. How is a parser supposed to know whether it has encountered #PCDATA or document text?
The answer is on page 9 of the RTF 1.9.1 specification.
In the example I gave in the question, \fonttbl is a destination control word, which means the text inside its group is written to the font table rather than to the document. On page 11 of the specification, a list of example control words that change the destination is given:
There are many more, but those are the main ones.
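To make the idea of a destination concrete, here is a minimal Python sketch of a reader that tracks the current destination per group and only emits plain text that belongs to the document body. It is not a complete RTF parser: the function name extract_body_text and the SKIPPED_DESTINATIONS set are my own assumptions for illustration, the set is deliberately incomplete, and escapes such as \'hh, \uN and \bin are not handled.

```python
import re

# Assumption: a small, incomplete set of destination control words whose
# text must not be written to the document body.
SKIPPED_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"  # control word, optional numeric parameter, optional delimiter space
    r"|\\(.)"                   # control symbol such as \* or \{
    r"|([{}])"                  # group open / close
    r"|([^\\{}]+)",             # run of plain text
    re.DOTALL,
)


def extract_body_text(rtf: str) -> str:
    """Return only the plain text that is written to the document body."""
    out = []
    stack = []        # saved "skipping" flags of the enclosing groups
    skipping = False  # True while the current destination is not the body
    for match in TOKEN.finditer(rtf):
        word, _param, symbol, brace, text = match.groups()
        if brace == "{":
            # A new group inherits the current destination.
            stack.append(skipping)
        elif brace == "}":
            # Closing a group restores the destination of the parent group.
            skipping = stack.pop() if stack else False
        elif word is not None:
            if word in SKIPPED_DESTINATIONS:
                skipping = True
        elif symbol is not None:
            if symbol == "*":
                # \* marks a destination that readers may ignore entirely.
                skipping = True
            elif symbol in "{}\\" and not skipping:
                out.append(symbol)  # escaped literal brace or backslash
        elif text is not None and not skipping:
            # CR and LF in the file are not part of the text.
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out)
```

With this approach, the font table from the question produces no body text at all, because "Times New Roman;" is read while the current destination is still the font table. Text that follows the closing brace of the \fonttbl group is written to the document as usual.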