Saving df into txt file breaks up lines


I have a data frame with one variable that I want to save as a txt file. The data frame looks normal, but when I save it as a txt file the lines get chopped up, and the word "short" also ends up in there for some reason (side note: I first load a larger txt file and find the unique values in it, which is what I want to save). Is this an encoding problem? Previously the output came out as gibberish, which is why I added ascii = TRUE. When I paste the lines here they seem fine, but these spaces aren't supposed to be there (the values are web UUIDs). Note that this persists even if I convert it to a vector first.

docslist <- read.delim("docs_copyforR.txt") 
short <- unique(docslist) 
save(short, ascii = TRUE, file = "uniquelist.txt")

Example of chopped up txt file:

RDA3
A
3
262401
197888
6
CP1252
1026
1
262153
5
short
787
1
16
13561
262153
36
4d61d186-daf1-4d43-aee7-975332c34d92
262153
36
4d61d186-daf1-4d43-aee7-975332c34d92
262153
36
3f562e5b-3625-4fe2-b841-2ed593c13f7e
262153
36

The "mother" txt file looks fine:

14112553000004
4d61d186-daf1-4d43-aee7-975332c34d92
3f562e5b-3625-4fe2-b841-2ed593c13f7e
392bb1c3-2ab9-4a26-8749-309047e53eaa
1

There is 1 answer

Caspar V. (best answer)

save() stores R objects so that R can read them back later with load(). It writes not only the data but also its metadata: that the object is a data frame, that it is named 'short', that its encoding is 'CP1252', what the column headers are, that the columns contain text, what the row numbers are, and so on. That metadata is where the extra lines in your file come from. The output is not intended to be human-readable.
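To make the point about save() concrete, here is a minimal sketch of the round trip it is designed for. The data frame contents and the temporary file path are stand-ins chosen for illustration, not the asker's actual data.

```r
# Hypothetical stand-in for the asker's data frame of unique IDs
short <- data.frame(
  id = c("4d61d186-daf1-4d43-aee7-975332c34d92",
         "3f562e5b-3625-4fe2-b841-2ed593c13f7e"),
  stringsAsFactors = FALSE
)

# Temporary path used only for this sketch
rda_path <- tempfile(fileext = ".rda")

# save() serializes the object together with its name and metadata
save(short, ascii = TRUE, file = rda_path)

# The first line of the file is a format marker, not data
first_line <- readLines(rda_path, n = 1)

# load() restores the object under its original name;
# a separate environment keeps it apart from the original
restored <- new.env()
loaded_names <- load(rda_path, envir = restored)
```

load() returns the names of the objects it restored, so loaded_names should contain "short", and restored$short should match the original data frame. The file in between is meant for R, not for a text editor.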

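R has several functions for writing plain-text files, such as write.table() and write.csv(). The readr package also provides write_delim(), the counterpart of readr's read_delim() (the base R counterpart of the read.delim() you used is write.table()):

library(readr)  # provides write_delim()
write_delim(short, file = "uniquelist.txt")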
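If you would rather stay in base R, write.table() can produce the same kind of plain file. The sketch below is one possible way to do it; the sample data frame and the temporary output path are assumptions made for illustration.

```r
# Hypothetical stand-in for the asker's data frame of unique IDs
short <- data.frame(
  id = c("4d61d186-daf1-4d43-aee7-975332c34d92",
         "3f562e5b-3625-4fe2-b841-2ed593c13f7e"),
  stringsAsFactors = FALSE
)

# Temporary path used only for this sketch
out_path <- tempfile(fileext = ".txt")

# Switch off quoting, row names and the header line so that
# the file contains nothing but the values, one per line
write.table(short, file = out_path,
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back to check what was written
written <- readLines(out_path)
```

With these three arguments turned off, each line of the file should be exactly one ID. Leaving them at their defaults would add quotes, row numbers and a header line.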

EDIT: since you're reading whole lines of text rather than delimited fields, I'd suggest using readr's read_lines() and write_lines() instead:

library(readr)  # provides read_lines() and write_lines()
docslist <- read_lines("docs_copyforR.txt")
short <- unique(docslist)
write_lines(short, file = "uniquelist.txt")

Note that docslist and short will be character vectors, not data frames. This also gets rid of the column header.
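The same line-based approach can also be sketched with base R's readLines() and writeLines(), which avoids the package dependency. The input file here is a temporary one built from the sample lines in the question, purely for illustration.

```r
# Build a small stand-in input file from the sample lines in the question
in_path <- tempfile(fileext = ".txt")
writeLines(c("14112553000004",
             "4d61d186-daf1-4d43-aee7-975332c34d92",
             "4d61d186-daf1-4d43-aee7-975332c34d92",
             "3f562e5b-3625-4fe2-b841-2ed593c13f7e"),
           in_path)

# Read every line as one element of a character vector
docslist <- readLines(in_path)

# Drop duplicates, keeping the first occurrence of each line
short <- unique(docslist)

# Write one element per line to a plain text file
out_path <- tempfile(fileext = ".txt")
writeLines(short, out_path)

# Read the result back to check it
result <- readLines(out_path)
```

Because no line is treated as a header here, the first line of the input is kept as ordinary data, and the output contains each distinct line exactly once.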