I am trying to switch from RODBC, which is depreciated, to DBIand odbc packages to retrieve data from a Filemaker database. It is configured to use UTF-8 encoding. I have configured unixodbc and using isql I am able to successfully retrieve data, including diacritics. For example, running isql -v mydb myuid mypwd and select Contexte from Mobilier where ID_Objet = 1 returns Dépôt, which is correct (the terminal is also configured to use UTF-8).
However, when I am switching to R, things are getting more complicated. For example, consider the code below:
library(DBI)
library(odbc)
con <- dbConnect(odbc::odbc(),
driver = "/Library/ODBC/FileMaker ODBC.bundle/Contents/MacOS/fmodbc.so",
database = "mydb",
uid = "myuid",
pwd = "mypwd",
host = "localhost")
val <- dbGetQuery(con, "select Contexte from Mobilier where ID_Objet = 1")
print(val)
Contexte
1 D\u008ep\u0099t
With RODBCeverything is working as expected:
library(RODBC)
con <- odbcConnect("mydb", uid = "myuid", pwd = "mypwd")
val <- sqlQuery(con, "select Contexte from Mobilier where ID_Objet = 1")
print(val)
Contexte
1 Dépôt
sessionInfo() returns :
R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.6.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Brussels
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RODBC_1.3-23
loaded via a namespace (and not attached):
[1] compiler_4.3.3 tools_4.3.3 rstudioapi_0.15.0
Any idea what I'm doing wrong?
EDIT
Following r2evans comment, I tried inserting some text in my database using RODBC and retrieving it with DBI:
library(RODBC)
library(DBI)
library(odbc)
con1 <- RODBC::odbcConnect("mydb", uid = "myuid", pwd = "mypwd")
con2 <- DBI::dbConnect(odbc::odbc(),
driver = "/Library/ODBC/FileMaker ODBC.bundle/Contents/MacOS/fmodbc.so",
database = "mydb",
uid = "myuid",
pwd = "mypwd",
host = "localhost")
# Data with diacritics to be inserted using RODBC
insert <- data.frame("ID_Objet" = 5000,
"Contexte" = "Dépôt")
# The encoding is right
Encoding(insert$Contexte)
> [1] "UTF-8"
# Inserting the data with RODBC...
RODBC::sqlSave(con1, insert)
# ...and retrieving it
x <- RODBC::sqlQuery(con1, "select * from insert")
# Diacritics are alright but not the encoding...
x$Contexte
> [1] "Dépôt"
Encoding(x$Contexte)
> [1] "unknown"
# Reading now the same data with DBI and odbc
y <- DBI::dbReadTable(con2, "insert")
# Diacritcs are gone but the original encoding is detected
y$Contexte
> [1] "D\u008ep\u0099t"
Encoding(y$Contexte)
>[1] "UTF-8"