Multibyte Delimiter in R


I am having trouble reading a .txt file that uses "xqz" as its delimiter. Using read_file or read_delim returns "invalid 'sep' value: must be one byte".

Is "xqz" a known delimiter that I am just unfamiliar with? This is a very large data set, and I think it uses "," "." "/" " " in the data itself, so I understand why those were not used as delimiters.

Any tips for either reading multibyte delimiters or converting to a single-byte delimiter?

What I tried: read_file and read_delim with sep = "xqz".

The data has sensitive information in it so I have made a fake version:

NAMExqzPLACExqzCOLORxqzTIMExqzDIRECTIONxqzSERVICE
JIM     xqz1101xqzREDxqz1200xqzWESTxqzSurgery
RALPH   xqz2201xqzBLUxqz1201xqzNORTxqzObservation
JEAN    xqz3301xqzGRExqz1202xqzSOUTxqzMedical
Answer by Emmanuel Hamel:

One approach is to read the file as plain lines, replace the multibyte delimiter with a single-byte one, and write the result back out:

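library(stringr)

# Sample data standing in for the real file
vec_Text <- c("NAMExqzPLACExqzCOLORxqzTIMExqzDIRECTIONxqzSERVICE",
              "JIM     xqz1101xqzREDxqz1200xqzWESTxqzSurgery",
              "RALPH   xqz2201xqzBLUxqz1201xqzNORTxqzObservation",
              "JEAN    xqz3301xqzGRExqz1202xqzSOUTxqzMedical")

# Write the sample data to disk so the example is reproducible
fileConn <- file("output.csv")
writeLines(vec_Text, fileConn)
close(fileConn)

# Read the raw lines and replace the literal delimiter "xqz" with ";"
text <- readLines("output.csv")
text <- stringr::str_replace_all(text, stringr::fixed("xqz"), ";")

# Write the converted lines to a new file that uses a single-byte delimiter
fileConn <- file("output_Mod.csv")
writeLines(text, fileConn)
close(fileConn)

text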

[1] "NAME;PLACE;COLOR;TIME;DIRECTION;SERVICE" "JIM     ;1101;RED;1200;WEST;Surgery"    
[3] "RALPH   ;2201;BLU;1201;NORT;Observation" "JEAN    ;3301;GRE;1202;SOUT;Medical"
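
If the goal is a data frame rather than a converted file, the same idea can probably be done in memory with base R. This is only a sketch: it assumes every line has the same number of fields and that "xqz" never appears inside a value. The `lines` vector is a stand-in for `readLines()` on the real file, and the variable names are my own.

```r
# Stand-in for: lines <- readLines("your_file.txt")  (hypothetical path)
lines <- c("NAMExqzPLACExqzCOLORxqzTIMExqzDIRECTIONxqzSERVICE",
           "JIM     xqz1101xqzREDxqz1200xqzWESTxqzSurgery",
           "RALPH   xqz2201xqzBLUxqz1201xqzNORTxqzObservation",
           "JEAN    xqz3301xqzGRExqz1202xqzSOUTxqzMedical")

# Split each line on the literal string "xqz"; fixed = TRUE turns off regex matching
parts <- strsplit(lines, "xqz", fixed = TRUE)

# The first element holds the column names, the rest are data rows
df <- as.data.frame(do.call(rbind, parts[-1]), stringsAsFactors = FALSE)
names(df) <- parts[[1]]

# Strip the padding around values such as "JIM     "
df[] <- lapply(df, trimws)

# Convert columns that look numeric (PLACE, TIME) from character to numeric types
df <- type.convert(df, as.is = TRUE)

str(df)
```

One caveat: strsplit() drops a trailing empty field, so a line that ends in "xqz" would come out one column short. For a very large file, reading and splitting in chunks may be needed to keep memory use manageable.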