Using FFT to sum independent random variables

176 views Asked by At

Based on the very useful answer to another question, I have written a function to compute the density function of a sum of random variables. The function takes two density functions f1 and f2, then produces the sum PDF of the sum of n1 and n2 random variates having these densities, respectively (so summing n1+n2 independent random variables):

dsumf2 <- function(f1, f2, n1=1, n2=1, a, b, k=2^14) {
  # Perform FFT
  x <- seq(a, b, length.out = k)
  p1 <- f1(x)
  p2 <- f2(x)
  s1 <- sum(p1)
  s2 <- sum(p2)
  d0 = fft(fft(p1/s1)^n1*fft(p2/s2)^n2, TRUE)
  d0 = Re(d0) * exp(n1/(n1+n2)*log(s1) + n2/(n1+n2)*log(s2)-log(k))
  data.frame(
    x = x, 
    d = d0
  )
}

The function appears to work well when the minimum is 0:

## Works well; test with Monte Carlo
## Sum 2 gamma(1,5) and 1 gamma(2, 10)
x = rgamma(100000,shape = 1, scale = 5) + rgamma(100000,shape = 1, scale = 5) + rgamma(100000,shape = 2, scale = 10)
hist(x, freq=FALSE, breaks = 100)
dsum <- dsumf2(\(x) dgamma(x, shape = 1, scale = 5), \(x) dgamma(x, shape = 2, scale = 10), n1=2, a=0, b=150)
lines(dsum$x, dsum$d, col = 'red')

Sum of three gamma variates

However, it fails when the minimum is not 0, as in this example:

x = rnorm(100000) + rnorm(100000) + rnorm(100000,-5,5)
hist(x, freq=FALSE, breaks = 100)
dsum <- dsumf2(\(x) dnorm(x), \(x) dnorm(x, -5, 5), n1 = 2, a = min(x)-10, b = max(x)+1)
lines(dsum$x, dsum$d, col = 'red')

Incorrect PDF of the sum of three normal variates

The resulting PDF wraps around, which is similar to the issue in this question, so I'm assuming it is the same issue.

plot(dsum$x, dsum$d, col = 'red', typ='l')

Periodicity and wrap around in the resulting PDF

I've tried using SynchWave::ifftshift and SynchWave::fftshift as in that answer, but I can't get it to work in general. Is there a way to write the function to shift the PDF in such a way that it is correct, regardless of the underlying distributions and their support?

2

There are 2 answers

2
Sina Salam On

I have seen some comments that are very helpful but no response to know if you get this done. However, to address the periodicity issue in your function dsumf2, you can indeed utilize the fftshift operation to properly align the Fourier transformed densities.

dsumf2 <- function(f1, f2, n1=1, n2=1, a, b, k=2^14) {
  # Perform FFT
  x <- seq(a, b, length.out = k)
  p1 <- f1(x)
  p2 <- f2(x)
  s1 <- sum(p1)
  s2 <- sum(p2)
  
  # Perform FFT with proper shifting
  d0 <- fftshift(fft(fftshift(p1/s1))^n1 * fft(fftshift(p2/s2))^n2)
  d0 <- Re(d0) * exp(n1/(n1+n2) * log(s1) + n2/(n1+n2) * log(s2) - log(k))
  
  # Correct for spacing in frequency domain
  dx <- (b - a) / k
  x <- seq(a, b, by = dx)
  
  data.frame(
    x = x,
    d = d0
  )
}

Now, with the above code you can use this modified function to compute the density function of the sum of random variables without experiencing the wrap-around issue. This is a kind of your example with normal distributions:

x <- rnorm(100000) + rnorm(100000) + rnorm(100000, -5, 5)
hist(x, freq=FALSE, breaks = 100)
dsum <- dsumf2(dnorm, function(x) dnorm(x, -5, 5), n1 = 2, a = min(x) - 10, b = max(x) + 1)
lines(dsum$x, dsum$d, col = 'red')

This should give you a correct PDF for the sum of three normal variates without the periodicity issue.

8
Sina Salam On

@richarddmoney, Sequel to the error you posted in the comment.

The issue you're having arises from the manipulation of the Fourier transformed densities and their subsequent inverse transformation. This mismatch in the number of rows occurs due to the incorrect alignment of the frequencies in the Fourier domain.

Therefore, to resolve this problem, you will need to ensure consistency in the number of points used in the Fourier transform and the subsequent inverse transform. This requires adjusting the code to correctly handle the shifting and spacing of the frequency domain.

The modified function with corrections I can put together at this time is here as follows:

library(SynchWave)  # Load the SynchWave package for fftshift operation

dsumf2 <- function(f1, f2, n1=1, n2=1, a, b, k=2^14) {
  # Perform FFT
  x <- seq(a, b, length.out = k)
  p1 <- f1(x)
  p2 <- f2(x)
  s1 <- sum(p1)
  s2 <- sum(p2)
  
  # Perform FFT with proper shifting
  d0 <- fftshift(fft(fftshift(p1/s1))^n1 * fft(fftshift(p2/s2))^n2)
  d0 <- Re(d0) * exp(n1/(n1+n2) * log(s1) + n2/(n1+n2) * log(s2) - log(k))
  
  # Correct for spacing in frequency domain
  dx <- (b - a) / k
  freq <- fftfreq(k, dx)
  
  # Inverse FFT with proper shifting
  d0 <- ifftshift(d0)
  d0 <- fft(d0, inverse = TRUE)
  d0 <- fftshift(d0)
  
  # Correct for spacing in the time domain
  x <- seq(a, b, by = dx)
  
  data.frame(
    x = x,
    d = Re(d0)
  )
}

In the above updated function, it ensures that the number of points in the time domain matches the number of points in the frequency domain, preventing any mismatches in the resulting data frame.

Now, you can use this corrected function to compute the density function of the sum of random variables without encountering the previous errors. If you still encounter issues, please let me know.