How to take the same random sample from a dataset every time


I have a dataset that consists of nearly 7 million observations and I want to take a random sample of the data to analyze just a subset. I know how to take a random sample of the data:

index <- sample(7009728, 50000)
flights <- flight[index, ]

Is there a way to take a random sample so that, once it has been drawn, rerunning the code always gives me the same sample? I'm hoping to do this without having to rely on saving my R project.

Best answer, by zero323:

Simply use set.seed just before you create index:

> set.seed(1)
> index <- sample(7009728, 50000)
> head(index)
[1] 1861144 2608487 4015546 6366287 1413735 6297463

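This sets the random number generator's seed, so sample() produces the same sequence of draws, and therefore the same index, every time the two lines are run together.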
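
To check that the seed really pins down the sample, here is a minimal sketch on a small stand-in data frame. The `flight` data below and the `take_sample` helper are made up for illustration and are not part of the original question; only `set.seed`, `sample`, `nrow` and `identical` are base R.

```r
# Hypothetical stand-in for the ~7-million-row `flight` data frame.
flight <- data.frame(id = 1:1000, delay = rep(c(5, -3, 12, 0), times = 250))

# Hypothetical helper: fix the seed, then draw n row indices.
take_sample <- function(data, n, seed = 1) {
  set.seed(seed)                  # reset the RNG state before sampling
  index <- sample(nrow(data), n)  # same seed and same call -> same indices
  data[index, , drop = FALSE]
}

flights_a <- take_sample(flight, 50)
flights_b <- take_sample(flight, 50)

identical(flights_a, flights_b)   # TRUE: both calls drew the same rows
```

Using nrow(data) instead of a hard-coded 7009728 keeps the code valid if the number of rows changes, although the sample itself will then change too. Note also that R 3.6.0 changed the default algorithm used by sample(), so the same seed can give different indices on older and newer R versions; RNGkind(sample.kind = "Rounding") restores the older behaviour if you need to match results from before 3.6.0.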