I want to create a covariance matrix from a data frame which is not yet suitable for creating one.
After using RPostgreSQL to query the database I have a data frame of the following type:
pg_id item_id value date
1 67808755896 23.5 2016-11-12
2 223337345 0 2016-11-12
3 254337000000 1 2016-11-12
4 34604777037 0 2016-11-12
5 142223438000 14.3 2016-11-12
6 170555690000 22 2016-11-12
The entire data frame is of ~500 000 rows with roughly 16 000 item_id's. The item_id's are repeated (looking back a couple of months here).
What I want to do eventually is to create a covariance matrix for the values of the item_id's.
In order to to so, as a first step I want to re-arrange the data frame in a way that I end up with a data frame that would look like the following:
item_id
date 67808755896 223337345 254337000000 ...
2016-11-12 value value value
2016-11-12 value value value
2016-11-12 value value value
2016-11-12 value value value
2016-11-12 value value value
2016-11-12 value value value
My problem is, that I don't know of a way to reorder the data frame the way I need to.
If there is a SQL query that would give me the option at the time of retrieval to get the desired structure, I guess that would be best.
Within R I tried a couple of things from using melt as well as spread but the computation seemed to be to heavy for my local mac which the last time I tried it just shut down at some point.
Thanks in advance for any help!
In R, this should run pretty fast: