I have a 20 GB transaction data set from Kaggle (http://www.kaggle.com/c/acquire-valued-shoppers-challenge/data).
It has over 300 million rows and 11 variables.
That is too heavy to handle in R, so I want to filter the data.

The id column has class integer64.
There are 311541 unique ids, and I want to sample 20000 of them.
I'm using data.table, but I get an error like the one in the picture.
Is there a way to sample ids?
If I recall correctly, integer64 values are just doubles masked as integers. Maybe the best way to obtain your subset without making any copy is to use the setattr function in data.table. Try this:
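A minimal sketch of that idea on a made-up toy table (the table, its values, and the sample size of 2 are hypothetical stand-ins for the real data and the 20000 ids). It assumes data.table and bit64 are installed, and that once the class is stripped the underlying doubles still compare correctly under unique and %in%, which I believe holds for ordinary non-negative ids but have not checked on the full file:

```r
library(data.table)
library(bit64)

# Hypothetical toy table standing in for the real transactions data
dt <- data.table(
  id     = as.integer64(c("86246", "86246", "86252",
                          "12262064", "12262064", "12277270")),
  amount = c(1.5, 2.0, 3.25, 4.0, 0.5, 9.9)
)

# Strip the integer64 class by reference (no copy of the column);
# the column is now treated as plain double
setattr(dt$id, "class", NULL)

# Sample ids, not rows: take the unique ids, then draw from them
ids  <- unique(dt$id)
keep <- ids[sample.int(length(ids), 2L)]   # 2 here, 20000 on the real data

# Keep every transaction belonging to a sampled id
sub <- dt[id %in% keep]

# Restore the integer64 class on both tables, again by reference
setattr(dt$id,  "class", "integer64")
setattr(sub$id, "class", "integer64")
```

While the class is stripped the ids print as meaningless tiny doubles, so only inspect them after the class has been set back.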