I am very new to using servers for data analysis, and I am completely baffled by what a "tbl_df""tbl""data.frame" object, and "tbl_Redshift" "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl" is in R? I know that something related to the Redshift server and tbl-lazy tables but that's it.
I have been following on a previous analyst's work, and I have used collect() to pull a bunch of data onto the local machine, doing some transformation before trying to upload it back to the server. However, I cannot upload the data using compute() for some reason, and I believe the reason is because the object itself became "tbl_df""tbl""data.frame" from "tbl_Redshift" "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl".
Is there a way to transform this back so I can upload it? Thanks in advance.
Let's begin by ensuring the core concepts are clear:
tbl_df,tbl, anddata.frameare all local objects where the data exist in local R memory.tbl_Redshift,tbl_dbi, andtbl_sqlare all remote objects where the data exists in the database and R has a pointer/query/connection to the database in local R memory.The
collectcommand retrieves data from a remote database into local R memory. Hence it converts an object of typetbl_Redshift,tbl_dbi, ortbl_sqlinto an object of typetbl_df,tbl, 0rdata.frame.The
computecommand works for remote objects (tbl_Redshift,tbl_dbi, ortbl_sql) and saves an temporary/intermediate stage of a calculation to the database, creating a new remote object. This helps reduce additional computation. It can not be used to copy data from local R memory.Put another way:
collect: remote data input -> local data outputcompute: remote data input -> remote date outputIt sounds like what you need is : local data input -> remote data output.
For this, there are two broad options:
collect. If the data is never copied from the server into local R memory then you never have to transfer it back.DBI::dbWriteTablecommand for this purpose. Example function below.(This is a cut down version of one of my dbplyr helpers here)