What is the best way to write an R data frame to a SQL Server database

6.6k views Asked by At

I have a R-code data frame that I am trying to write to an existing table in SQL Server. The data frame contains only 8 of about 12 columns contained in the table and the columns in the data frame are not in the same order as the columns in the table. SQL Server is complaining because there are columns missing and other columns that are of the wrong data type (e.g. Varchar (string)vs date, etc.).

I am looking at functions in RODBC and DBI libraries to write the data frame to my SQL Server table, but it is clear that I have to line up the data frame columns in the order expected by the table and to put null place holders in for the missing columns.

What are my options?

Thank you ahead of time for any help you can provide.

2

There are 2 answers

0
GameChanger On BEST ANSWER

So the obvious choice from the RODBC package would be sqlSave(connection,'R data frame','SQL Table'), however as you know that doesn't work. In these cases I write an INSERT INTO statement using sqlQuery(). sqlQuery(connection, "INSERT INTO sqltable (B,C) VALUES ('i1',j1),('i2',j2)...")

Example: We have a SQL table named sqltable with columns A, B, and C. We have an R dataframe with columns B & C named Rdf. With B being of class character and C being of class numeric.

First need to put single quotes around any character fields because we will be using INSERT INTO and SQL likes single quotes around any text Rdf$B <- paste("'",Rdf$B,"'",sep="")

Next we need to format our data so it will look like the VALUES section of an insert statement formatRdf <- do.call(paste,c(Rdf,sep=","))

valuesRdf <- paste("(",paste(formatRdf,collapse="),("),")",sep="")

Final step of preparing the INSERT statement

sql_statement <- paste("INSERT INTO sqltable (B,C) VALUES ",valuesRdf,sep="")

Now use sqlQuery

sqlQuery(connection,sql_statement)
1
jangorecki On

If you are looking for performance then probably rsqlserver will be best choice:

Sql Server driver database interface (DBI) driver for R. This is a DBI-compliant Sql Server driver based on the The .NET Framework Data Provider for SQL Server (SqlClient) System.Data.SqlClient.

Motivation

The .NET Framework Data Provider for SQL Server (SqlClient) uses its own protocol to communicate with SQL Server. It is lightweight and performs well because it is optimized to access a SQL Server directly without adding an OLE DB or Open Database Connectivity (ODBC) layer.

In the wiki of the project you will find benchmarks for RODBC, RJDBC and rsqlserver.


Once you have package to talk to database you follow standard DBI examples, so dbWriteTable or dbSendQuery for create/insert/update.