Read in a dataframe from .txt file with special characters in R

I have speech transcriptions with lots of special characters in a column in a dataframe, like so:

">like I don't understand< sorry like how old's your mom¿"
"°ye[a:h]°"
"°I don't know°"

When I read in the dataframe using read.table, I get the following output, where several odd extra characters have been incorrectly inserted:

Output in R:

">like I don't understand< sorry like how old's your mom¿"
"°ye[a:h]°"
"°I don't know°"

How can I fix this issue?

There is 1 answer

Answer by Johan Rosa (best answer)

You can specify the encoding while importing, or set it after importing the data.

Option 1

df <- read.table('path/file.ext', encoding = "UTF-8", ...)
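
As a rough, self-contained sketch of Option 1: the snippet below writes a small UTF-8 file to a temporary path and reads it back. The file, the column name text, and the choice of sep, quote and comment.char are assumptions made for illustration (they are not from the question); quoting is switched off because the transcriptions contain apostrophes, which read.table would otherwise treat as quote characters.

```r
# Hypothetical sample file, written as UTF-8 so the example is self-contained.
# \u escapes are used so the script does not depend on its own file encoding.
path <- tempfile(fileext = ".txt")
rows <- c(
  "text",
  ">like I don't understand< sorry like how old's your mom\u00bf",
  "\u00b0ye[a:h]\u00b0",
  "\u00b0I don't know\u00b0"
)
writeLines(enc2utf8(rows), path, useBytes = TRUE)

df <- read.table(
  path,
  header = TRUE,
  sep = "\t",               # assumption: one tab-separated column
  quote = "",               # keep apostrophes such as "don't" literal
  comment.char = "",        # do not treat "#" as a comment marker
  stringsAsFactors = FALSE,
  encoding = "UTF-8"        # mark the strings that are read as UTF-8
)

print(df$text)
unlink(path)
```

If the file is stored in some other encoding, read.table also has a fileEncoding argument that re-encodes the input while reading; which of the two is appropriate depends on the file and on the locale R is running in.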

Option 2

x <- c(
  ">like I don't understand< sorry like how old's your mom¿",
  "°ye[a:h]°",
  "°I don't know°")

Encoding(x) <- 'UTF-8'

print(x)
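
One caveat worth sketching for Option 2: Encoding(x) <- only changes the label attached to the bytes, it does not convert them. So it helps when the bytes already are UTF-8, whereas bytes in another encoding need an actual conversion, for example with iconv. The example strings below are constructed by hand for illustration, and Latin-1 is an assumed source encoding for the second case.

```r
x <- "\u00b0ye[a:h]\u00b0"

# Case 1: the bytes are valid UTF-8 but carry no encoding mark.
unmarked <- rawToChar(charToRaw(x))
Encoding(unmarked) <- "UTF-8"   # re-labels the string, bytes are untouched
same_bytes <- identical(charToRaw(unmarked), charToRaw(x))

# Case 2: the bytes are Latin-1 (assumed here), so they must be converted.
latin1 <- rawToChar(as.raw(c(0xb0, 0x79, 0x65, 0xb0)))  # degree sign, "ye", degree sign
converted <- iconv(latin1, from = "latin1", to = "UTF-8")

print(unmarked)
print(converted)
```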