I have N Bernoulli variables X1, ..., XN, where Xi ~ B(1, pi) and pi is known for each Xi. Let Y = X1 + ... + XN. I need to get the distribution of Y.
If Xi and Xj are independent for i != j, then I can use simulation:
1. Generate `X1`, ..., `XN` from their distributions, and compute the value of `Y`;
2. Repeat step 1 10000 times to get `Y1`, ..., `Y10000`, which gives an empirical estimate of the distribution of `Y`.
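For reference, the two steps above could be sketched in Ruby roughly as follows. This is only a minimal sketch for the independent case; the method name `simulate_sum_distribution`, the `reps` and `rng` parameters, and the example probabilities are made up for illustration.

```ruby
# Estimate the distribution of Y = X1 + ... + XN for INDEPENDENT Bernoulli Xi.
# probs: array of success probabilities p_i (the example values are hypothetical).
def simulate_sum_distribution(probs, reps: 10_000, rng: Random.new)
  counts = Array.new(probs.size + 1, 0)
  reps.times do
    # Step 1: draw each Xi and add them up to get one value of Y.
    y = probs.count { |p| rng.rand < p }
    counts[y] += 1
  end
  # Step 2: relative frequencies approximate P(Y = k) for k = 0..N.
  counts.map { |c| c.to_f / reps }
end

probs = [0.1, 0.5, 0.8]
dist = simulate_sum_distribution(probs, rng: Random.new(42))
dist.each_with_index { |p, k| puts format("P(Y = %d) ~ %.4f", k, p) }
```

Each entry of `dist` is the fraction of repetitions in which `Y` took the value `k`, so the entries sum to 1.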
But now Xi and Xj are dependent, so I also need to take the correlation into account. Assuming corr(Xi, Xj) = 0.2 for i != j, how can I build the correlation into the simulation? Or is there another way to get the distribution of Y?
Thanks for the help and advice.
You can generate specific pairwise correlations (within limits) by deriving the conditional distribution of one variable given the other. The limits are that you can't have completely arbitrary p-values and correlations, even for a single pair. Moreover, the simultaneous constraints implied by all N-choose-2 pairwise correlations can be infeasible for some choices of N, p-values, and correlations.
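To make the "within limits" part concrete for a single pair, here is a rough feasibility check. It is a sketch under the assumption that both p-values lie strictly between 0 and 1, and the helper names `joint_p11`, `feasible_pair?`, and `correlation_bounds` are invented for this illustration. It relies on the identity P(X1 = 1, X2 = 1) = p1*p2 + rho*sqrt(p1*(1 - p1)*p2*(1 - p2)) and on the fact that this joint probability must lie between max(0, p1 + p2 - 1) and min(p1, p2).

```ruby
# Joint probability P(X1 = 1, X2 = 1) implied by marginals p1, p2 and correlation rho.
def joint_p11(p1, p2, rho)
  p1 * p2 + rho * Math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
end

# A pair is feasible only if every cell of the 2x2 joint table is a valid probability,
# which is equivalent to max(0, p1 + p2 - 1) <= p11 <= min(p1, p2).
def feasible_pair?(p1, p2, rho)
  p11 = joint_p11(p1, p2, rho)
  p11 >= [0.0, p1 + p2 - 1].max && p11 <= [p1, p2].min
end

# Smallest and largest correlation attainable for the given marginals
# (assumes 0 < p1 < 1 and 0 < p2 < 1, so the denominator is nonzero).
def correlation_bounds(p1, p2)
  sd = Math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
  lo = ([0.0, p1 + p2 - 1].max - p1 * p2) / sd
  hi = ([p1, p2].min - p1 * p2) / sd
  [lo, hi]
end

puts feasible_pair?(0.5, 0.5, 0.2)  # a correlation of 0.2 is attainable here
puts feasible_pair?(0.1, 0.9, 0.2)  # but not for these very unequal p-values
p correlation_bounds(0.1, 0.9)
```

Passing this check for every pair is necessary but not sufficient: all pairs can be individually feasible while no joint distribution over the N variables has those correlations.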
The following Ruby implementation shows the calculations for obtaining specified p-values and correlations for a pair of X's: