First, I want to create a column that randomizes 1s and 0s by group while maintaining the same proportion of 1s and 0s as in another column.
Second, I want to repeat the above procedure many times (say 1000) and calculate the expected value.
Let me clarify with hypothetical data.

    library(data.table)
    district <- c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,3)
    village <- c(1,2,3,4,1,2,3,4,5,1,2,3,4,5,6,7)
    status <- c(1,0,1,0,1,1,1,0,0,1,1,1,1,0,0,0)
    datei <- data.table(district, village, status)

I want to create a column that randomizes 1s and 0s within each district while maintaining the same proportion of 1s and 0s as in status; the 1:0 ratios are 2:2, 3:2 and 4:3 in districts 1, 2 and 3 respectively.
Second, I also want to repeat this randomization many times (say 1000 times) and calculate the expected value for each row.
I know how to randomize 1s and 0s based on district.

    datei[, random_status := sample(c(1,0), .N, replace=TRUE), keyby = district]

However, I do not know how to have the same proportion of 1s and 0s as in status and how to repeat and calculate the expected values for each row.
Many thanks.
Edit: Let me add what I expect when calculating the expected value for each row after, say, 1000 repetitions. The column exp_status is generated by randomizing many times while keeping the 1:0 proportion within each district the same as in status.
district | village | status | exp_status |
---|---|---|---|
1 | 1 | 1 | 0.9 |
1 | 2 | 0 | 0.7 |
1 | 3 | 1 | 0.8 |
1 | 4 | 0 | 0.1 |
2 | 1 | 1 | 0.2 |
2 | 2 | 1 | 0.3 |
2 | 3 | 1 | 0.2 |
2 | 4 | 0 | 0.9 |
2 | 5 | 0 | 0.8 |
3 | 1 | 1 | 0.4 |
3 | 2 | 1 | 0.5 |
3 | 3 | 1 | 0.9 |
3 | 4 | 1 | 0.8 |
3 | 5 | 0 | 0.9 |
3 | 6 | 0 | 0.8 |
3 | 7 | 0 | 0.7 |
Use a `table` as `prob=`, which on a large scale gives similar proportions.

Data (slightly blown up, to 1e5 rows):
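As a rough sketch of the idea on the question's original 16-row `datei` (not the blown-up data), the per-district counts of `status` can be passed to `sample()` as `prob=`. This only matches the proportions on average, so the sketch also shows a within-district permutation, which keeps the 1:0 counts exact, and repeats it to estimate the per-row expected value. The names `random_prob`, `random_exact`, `n_rep` and `sims` and the seed are illustrative assumptions; using `ave()` for the repetition is one possible choice, picked here because it preserves row order.

```r
library(data.table)

# Data from the question
datei <- data.table(
  district = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,3),
  village  = c(1,2,3,4,1,2,3,4,5,1,2,3,4,5,6,7),
  status   = c(1,0,1,0,1,1,1,0,0,1,1,1,1,0,0,0)
)

set.seed(42)   # arbitrary seed, only for reproducibility
n_rep <- 1000  # number of repetitions (hypothetical name)

# (a) Approximate proportions: counts of 0s and 1s per district used as prob=.
#     factor(levels = 0:1) keeps both cells even if a district lacks one value.
datei[, random_prob := sample(0:1, .N, replace = TRUE,
                              prob = as.numeric(table(factor(status, levels = 0:1)))),
      by = district]

# (b) Exact proportions: permute status within each district.
datei[, random_exact := status[sample.int(.N)], by = district]

# Repeat the exact permutation n_rep times; ave() returns values in row order,
# so each column of `sims` is one randomized status vector.
sims <- replicate(
  n_rep,
  ave(datei$status, datei$district,
      FUN = function(s) s[sample.int(length(s))])
)

# Per-row average over all repetitions
datei[, exp_status := rowMeans(sims)]
datei
```

Under either scheme every village in a district has the same chance of receiving a 1, namely the district's share of 1s, so `exp_status` should settle near 0.5, 0.6 and 4/7 for districts 1, 2 and 3 as `n_rep` grows.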