I have 2,000+ tables, some with hundreds of lines, that I'm downloading from a web service (of botanical names) and saving to disk for further inspection.
Since some text fields contain carriage returns, I decided to quote everything. But some fields contain " characters and others contain ' characters, so neither can be used for quoting. (I could try to escape them, but some are already escaped, and this would quickly become a mess, so I thought it would be easier to use a different quote character.) I tried %, only to find that some fields also use it, so I need something different. I tried ¨, ☺, π and 人, but nothing seems to work! All of them display correctly on screen (RKWard on Ubuntu 14.04) and all are saved correctly with write.table, but NONE can be read back with read.table or read.csv. I'm using UTF-8 as fileEncoding. I get the message "invalid multibyte string", even for ☺ (which I understood to be the first ASCII character).
Sys.getlocale(category="LC_ALL")
gives
"LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=pt_BR.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=pt_BR.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=pt_BR.UTF-8;LC_NAME=pt_BR.UTF-8;LC_ADDRESS=pt_BR.UTF-8;LC_TELEPHONE=pt_BR.UTF-8;LC_MEASUREMENT=pt_BR.UTF-8;LC_IDENTIFICATION=pt_BR.UTF-8"
I tried changing the locale to Chinese in order to use 人 (which shouldn't be needed, I guess, since it displays and saves correctly), but that didn't work either. I get
OS reports request to set locale to "chinese" cannot be honored
OS reports request to set locale to "Chinese" cannot be honored
OS reports request to set locale to "zh_CN.utf-8" cannot be honored
Now the strangest part: if the Chinese characters are in the body of the data, they're read without a problem. It seems they just can't be used as quote characters!
Any ideas? Thanks in advance.
I'm not sure this is the solution you're looking for, but if I understood correctly, you have CR/LF characters in your text that make it hard to read the data as a table. If so, you can use readLines, which automatically handles \r, \n and \r\n, and then read the result as a table. For example, given a file crlf.txt, you can read it with readLines and then parse the lines as a table.
Obviously the line breaks are now escaped when printed, otherwise they would actually break the lines.
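A minimal sketch of the readLines-then-table idea, under some assumptions: the file name, the `;` separator, and the sample rows are made up for illustration and are not the original crlf.txt. It only shows that lines ending in \r\n and lines ending in \n are both read cleanly and can then be parsed with read.table via its `text` argument.

```r
# Hypothetical sample file: the path, separator, and contents are assumptions.
path <- tempfile(fileext = ".txt")

# Write raw bytes so the mixed line endings (\r\n and \n) are kept exactly as typed.
writeBin(charToRaw("id;name\r\n1;Quercus robur\n2;Acer rubrum\r\n"), path)

# readLines accepts LF, CRLF, or CR as the end-of-line marker and drops it.
lines <- readLines(path)

# Parse the in-memory lines as a table instead of reading the file directly.
df <- read.table(text = lines, sep = ";", header = TRUE,
                 stringsAsFactors = FALSE)
print(df)
```

If a field itself contains a line break, this splits the record across two lines, so the approach fits files where the stray CR/LF characters sit at line ends rather than inside quoted fields.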