Loading data into R with rsqlserver package

5.9k views Asked by At

I've just installed rsqlserver like so (no errors)

install_github('rsqlserver', 'agstudy',args = '--no-multiarch')

And created a connection to my database:

> library(rClr)
> library(rsqlserver)

Warning message:
multiple methods tables found for ‘dbCallProc’ 

> drv <- dbDriver("SqlServer")
> conn <- dbConnect(drv, url = "Server=MyServer;Database=MyDB;Trusted_Connection=True;")
> 

Now when I try to get data using dbGetQuery, I get this error:

> df <- dbGetQuery(conn, "select top 100 * from public2013.dim_Date")

Error in clrCall(sqlDataHelper, "GetConnectionProperty", conn, prop) : 
  Type:    System.MissingMethodException
Message: Method not found: 'System.Object System.Reflection.PropertyInfo.GetValue(System.Object)'.
Method:  System.Object GetConnectionProperty(System.Data.SqlClient.SqlConnection, System.String)
Stack trace:
   at rsqlserver.net.SqlDataHelper.GetConnectionProperty(SqlConnection _conn, String prop)

> 

When I try to fetch results using dbSendQuery, I also get an error.

> res <- dbSendQuery(conn, "select top 100 * from public2013.dim_Date")
> df <- fetch(res, n = -1)

Error in clrCall(sqlDataHelper, "Fetch", stride) : 
  Type:    System.InvalidCastException
Message: Object cannot be stored in an array of this type.
Method:  Void InternalSetValue(Void*, System.Object)
Stack trace:
   at System.Array.InternalSetValue(Void* target, Object value)
   at System.Array.SetValue(Object value, Int32 index)
   at rsqlserver.net.SqlDataHelper.Fetch(Int32 capacity) in c:\projects\R\rsqlserver\src\rsqlserver.net\src\SqlDataHelper.cs:line 116

Strangely, the file c:\projects\R\rsqlserver\src\rsqlserver.net\src\SqlDataHelper.cs doesn't actually exist on my computer.

Am I doing something wrong?

2

There are 2 answers

9
agstudy On BEST ANSWER

I am agstudy the creator of rsqlserver package. Sorry for the late but I finally I get some time to fix this bug. ( actually it was a not yet implemented feature). I demonstrate here how you can read/write data.frame with missing values in Sql server.

First I create a data.frame with missing values. It is important to distinguish the difference between numeric and character variables.

library(rsqlserver)
url = "Server=localhost;Database=TEST_RSQLSERVER;Trusted_Connection=True;"
conn <- dbConnect('SqlServer',url=url)
## create a table with some missing value
dat <- data.frame(txt=c('a',NA,'b',NA),
                  value =c(1L,NA,NA,2))

My input looks like this :

# txt value
# 1    a     1
# 2 <NA>    NA
# 3    b    NA
# 4 <NA>     2

I insert dat in my data base with the handy function dbWriteTable: dbWriteTable(conn,name='T_TABLE_WITH_MISSINGS', dat,row.names=FALSE,overwrite=TRUE) Then I will read it using 2 methods:

dbSendQuery

res = dbSendQuery(conn,'SELECT * 
                  FROM T_TABLE_WITH_MISSINGS')
fetch(res,n=-1)
dbDisconnect(conn)
   txt value
1    a     1
2 <NA>   NaN
3    b   NaN
4 <NA>     2

dbReadTable:

rsqlserver is DBI compliant and implement many convenient functions to deal at least at possible with SQL.

conn <- dbConnect('SqlServer',url=url)
dbReadTable(conn,name='T_TABLE_WITH_MISSINGS')
dbDisconnect(conn)
   txt value
1    a     1
2 <NA>   NaN
3    b   NaN
4 <NA>     2
4
j-m On

(EDIT: I had missed something in your post (call to fetch). I can now reproduce the issue too.)

Short story is: do you have a NULL value in your database? this may be the cause.

Longer story, for a full repro:

I've used a sample DB reproducible by following the instructions at http://www.codeproject.com/Tips/326527/Create-a-Sample-SQL-Database-in-Less-Than-2-Minute

EDIT: I can reproduce your issue with:

library(rClr)
library(rsqlserver)
drv <- dbDriver("SqlServer")
conn <- dbConnect(drv, url = "Server=Localhost\\somename;Database=Fabrics;Trusted_Connection=True;")
res <- dbSendQuery(conn, "SELECT TOP 100 * FROM [Fabrics].[dbo].[Client]")
str(res)
## Formal class 'SqlServerResult' [package "rsqlserver"] with 1 slots
  ..@ Id:<externalptr> 
> df <- fetch(res, n = -1)
Error in clrCall(sqlDataHelper, "Fetch", stride) : 
  Type:    System.InvalidCastException
Message: Object cannot be stored in an array of this type.
Method:  Void InternalSetValue(Void*, System.Object)
Stack trace:
   at System.Array.InternalSetValue(Void* target, Object value)
   at System.Array.SetValue(Object value, Int32 index)
   at rsqlserver.net.SqlDataHelper.Fetch(Int32 capacity) in c:\projects\R\rsqlserver\src\rsqlserver.net\src\SqlDataHelper.cs:line 116

the following commands suggest things work as expected if using other commands.

> dbExistsTable(conn, name='Client')
Error in sqlServerExecScalar(conn, statement, ...) : 
  Message: There is already an open DataReader associated with this Command which must be closed first.
> dbClearResult(res)
[1] TRUE
> dbExistsTable(conn, name='Client')
[1] TRUE
> dbExistsTable(conn, name='SomeIncorrectColumn')
[1] FALSE

Note that I cannot reproduce the very odd one about MissingMethodException

df <- dbGetQuery(conn, "SELECT TOP 100 * FROM [Fabrics].[dbo].[Client]")
Error in clrCall(sqlDataHelper, "Fetch", stride) : 
  Type:    System.InvalidCastException
Message: Object cannot be stored in an array of this type.
Method:  Void InternalSetValue(Void*, System.Object)
Stack trace:
   at System.Array.InternalSetValue(Void* target, Object value)
   at System.Array.SetValue(Object value, Int32 index)
   at rsqlserver.net.SqlDataHelper.Fetch(Int32 capacity) in c:\projects\R\rsqlserver\src\rsqlserver.net\src\SqlDataHelper.cs:line 116

Since the debug symbols seem present, I can debug it further through visual studio. It bombs in SqlDataHelper.Fetch at

_resultSet[_cnames[i]].SetValue(_reader.GetValue(i), cnt);

and the variable watch gives me:

i   11  int
_cnames[i]  "Street2"   string
_reader.GetValue(i) {}  object {System.DBNull}
_reader.GetValue(i-1)   "806 West Sir Francis Drake St" object {string}
_reader.GetValue(i+1)   "Spokane"   object {string}

The entry for Street2 is indeed a NULL:

ClientId    FirstName   MiddleName  LastName    Gender  DateOfBirth CreditRating    XCode   OccupationId    TelephoneNumber Street1 Street2 City    ZipCode Longitude   Latitude    Notes
1   Nicholas    Pat Kane    M   1975-10-07 00:00:00.000 3   ZU8 5ML 4   (279) 459 - 2707    2870 North Cherry Blvd. NULL    Carlsbad    64906   32.7608137325835    117.112738329071    

For information, sessionInfo() output includes:

R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
other attached packages:
[1] rsqlserver_1.0 rClr_0.5-2    
loaded via a namespace (and not attached):
[1] DBI_0.2-7   tools_3.0.2

Hope this helps.