Can somebody please help me understand how to do this task:
Use mixture distribution concept to generate 4900 pseudo-random numbers f(x) = 0 when x<-6, 4 when -6<=x<0, 3exp(-2.1x) + 2exp(-1.4x) when x>=0 proportional to this density function. Store the generated values in variable X3.
We did something very similar in class but the teacher didnt explain anything and I have no idea from where he got that stuff. I have already asked the teacher for explanations but he is no help and I just dont understand that stuff.
The task we did in class with solution: Use the mixture distribution concept to generate 10000 values with function f(x) = exp(-2x^2)+0.5*exp(-2x) when x >=-1 and 0 when x<-1 Proportional density function, approximate the corresponding random variables mean based on these values (hint: when generating values, it would ne good to use conditional distribution generation in one case and appropriatly chosen linear transformation in other. SOLUTION:(rstudio)
f <- function(x){
return((exp(-2*x^2)+0.5*exp(-2*x))*(x>= -1))
}
F1_a <- pnorm(-1,sd=0.5) #the first term corresponds to the distribtuon N(0,0.5)
F1_b <- 1
curve(f,-2,10)
d1 <- function(x){
return(dnorm(x,sd=0.5)/(F1_b-F1_a)*(x>=-1))# The conditional distribution density is the density of the base distribution divided by the probability of the interval in the corresponding interval and zero elsewhere.
}
d2 <- function(x){return(dexp(x+1,rate=2))}
int_f <- (sqrt(2*pi)*0.5)*(F1_b-F1_a)+exp(2)/4 #Transform the first term into a density, the integral of the density is expressed through the distribution function.
p1 <- (sqrt(2*pi)*0.5)*(F1_b-F1_a)/int_f
p2 <- exp(2)/4/int_f
curve(f(x)/int_f-p1*d1(x)-p2*d2(x),-2,5)
The density function is indeed representable using the densities of the base distributions and that the weights are correct. Small differences arise from rounding errors occurring in computations with floating-point numbers.
n<-10000
milline <- sample.int(2,n,replace=TRUE,prob = c(p1,p2))
tulem <- rep(NA,n)
mitu <- table(milline)
tulem[milline==1]<- qnorm(runif(mitu[1],F1_a,F1_b),sd=0.5)
tulem[milline==2]<- rexp(mitu[2],2)-1
hist(tulem,30,probability =TRUE)
curve(f(x)/int_f,-1,5,add=TRUE)
mean(tulem)
Ok, here is hopefully simple explanation
You have your function written as
f(x) = C1*f1(x) + C2*f2(x), where C1 and C2 are some constants
you want to make it PDF out of it and generate some samples.
First step is to compute normalization term
N = ∫ dx f(x) = C1* ∫ dx f1(x) + C2* ∫ dx f2(x)
Lets denote
N1 = ∫ dx f1(x) and N2 = ∫ dx f2(x)
Therefore
N = C1*N1 + C2*N2
and
PDF(x) = f(x)/N = [C1*f1(x)+C2*f2(x)]/(C1*N1 + C2*N2)
Lets rewrite is as proper mixture
PDF(x) = f(x)/N = [C1*N1*f1(x)/N1 + C2*N2*f2(x)/N2]/(C1*N1 + C2*N2)
Easy to notice that
f1(x)/N1 is proper PDF in itself which could be used for sampling
PDF1(x) = f1(x)/N1
Same for PDF2
PDF2(x) = f2(x)/N2
And we could write
PDF(x) = W1*PDF1(x) + W2*PDF2(x)
where
W1=C1*N1/(C1*N1 + C2*N2) and W2=C2*N2/(C1*N1 + C2*N2)
W1+W2=1
So here is simple rule for sampling:
That is it. It could be easily extended for more than two number of terms