monte carlo methods (how to get mixed distribution from density function)

109 views Asked by At

Can somebody please help me understand how to do this task:

Use mixture distribution concept to generate 4900 pseudo-random numbers f(x) = 0 when x<-6, 4 when -6<=x<0, 3exp(-2.1x) + 2exp(-1.4x) when x>=0 proportional to this density function. Store the generated values in variable X3.

We did something very similar in class but the teacher didnt explain anything and I have no idea from where he got that stuff. I have already asked the teacher for explanations but he is no help and I just dont understand that stuff.

The task we did in class with solution: Use the mixture distribution concept to generate 10000 values with function f(x) = exp(-2x^2)+0.5*exp(-2x) when x >=-1 and 0 when x<-1 Proportional density function, approximate the corresponding random variables mean based on these values (hint: when generating values, it would ne good to use conditional distribution generation in one case and appropriatly chosen linear transformation in other. SOLUTION:(rstudio)

f <- function(x){
  return((exp(-2*x^2)+0.5*exp(-2*x))*(x>= -1))
}
F1_a <- pnorm(-1,sd=0.5) #the first term  corresponds to the distribtuon N(0,0.5)
F1_b <- 1
curve(f,-2,10)

d1 <- function(x){
  return(dnorm(x,sd=0.5)/(F1_b-F1_a)*(x>=-1))# The conditional distribution density is the density of the base distribution divided by the probability of the interval in the corresponding interval and zero elsewhere.
}
d2 <- function(x){return(dexp(x+1,rate=2))}
int_f <- (sqrt(2*pi)*0.5)*(F1_b-F1_a)+exp(2)/4 #Transform the first term into a density, the integral of the density is expressed through the distribution function.
p1 <- (sqrt(2*pi)*0.5)*(F1_b-F1_a)/int_f
p2 <- exp(2)/4/int_f
curve(f(x)/int_f-p1*d1(x)-p2*d2(x),-2,5)
The density function is indeed representable using the densities of the base distributions and that the weights are correct. Small differences arise from rounding errors occurring in computations with floating-point numbers.
n<-10000
milline <- sample.int(2,n,replace=TRUE,prob = c(p1,p2))
tulem <- rep(NA,n)
mitu <- table(milline)
tulem[milline==1]<- qnorm(runif(mitu[1],F1_a,F1_b),sd=0.5)
tulem[milline==2]<- rexp(mitu[2],2)-1
hist(tulem,30,probability =TRUE)
curve(f(x)/int_f,-1,5,add=TRUE)
mean(tulem)
1

There are 1 answers

0
Severin Pappadeux On

Ok, here is hopefully simple explanation

You have your function written as

f(x) = C1*f1(x) + C2*f2(x), where C1 and C2 are some constants

you want to make it PDF out of it and generate some samples.

First step is to compute normalization term

N = ∫ dx f(x) = C1* ∫ dx f1(x) + C2* ∫ dx f2(x)

Lets denote

N1 = ∫ dx f1(x) and N2 = ∫ dx f2(x)

Therefore

N = C1*N1 + C2*N2

and

PDF(x) = f(x)/N = [C1*f1(x)+C2*f2(x)]/(C1*N1 + C2*N2)

Lets rewrite is as proper mixture

PDF(x) = f(x)/N = [C1*N1*f1(x)/N1 + C2*N2*f2(x)/N2]/(C1*N1 + C2*N2)

Easy to notice that

f1(x)/N1 is proper PDF in itself which could be used for sampling

PDF1(x) = f1(x)/N1

Same for PDF2

PDF2(x) = f2(x)/N2

And we could write

PDF(x) = W1*PDF1(x) + W2*PDF2(x)

where

W1=C1*N1/(C1*N1 + C2*N2) and W2=C2*N2/(C1*N1 + C2*N2)

W1+W2=1

So here is simple rule for sampling:

  1. Sample U01 and compare with W1, if less sample from PDF1(x)
  2. If more sample from PDF2(x)

That is it. It could be easily extended for more than two number of terms