how can I do vector integral in R?

1.9k views Asked by At

I want to integrate a one dimensional vector in R, How should I do that? Let's say I have:

d=hist(p, breaks=100, plot=FALSE)$density

where p is a sample like:

p=rnorm(1e5)

How can I calculate an integral over d?

1

There are 1 answers

0
Jellen Vermeir On BEST ANSWER

If we assume that the values in d correspond to the y values of a function then we can calculate the integral by using a discrete approximation. We can for example use the trapezium rule or Simpsons rule for this purpose. We then also need to input the stepsize that corresponds to the discrete interval on the x-axis in order to "approximate the area under the curve".

Discrete integration functions defined below:

p=rnorm(1e5)
d=hist(p,breaks=100,plot=FALSE)$density

discreteIntegrationTrapeziumRule <- function(v,lower=1,upper=length(v),stepsize=1)
{
  if(upper > length(v))
    upper=length(v)
  if(lower < 1)
    lower=1

  integrand <- v[lower:upper]
  l <- length(integrand)
  stepsize*(0.5*integrand[1]+sum(integrand[2:(l-1)])+0.5*v[l])
}

discreteIntegrationSimpsonRule <- function(v,lower=1,upper=length(v),stepsize=1)
{
  if(upper > length(v))
    upper=length(v)
  if(lower < 1)
    lower=1

  integrand <- v[lower:upper]
  l <- length(integrand)
  a = seq(from=2,to=l-1,by=2);
  b = seq(from=3,to=l-1,by=2)
  (stepsize/3)*(integrand[1]+4*sum(integrand[a])+2*sum(integrand[b])+integrand[l])
}

As an example, let's approximate the complete area under the curve while assuming discrete x steps of size 1 and then do the same for the second half of d while we assume x-steps of size 0.2.

> plot(1:length(d),d) # stepsize one on x-axis
> resultTrapeziumRule <- discreteIntegrationTrapeziumRule(d) # integrate over complete interval, assume x-stepsize = 1
> resultSimpsonRule <- discreteIntegrationSimpsonRule(d) # integrate over complete interval, assume x-stepsize = 1
> resultTrapeziumRule
[1] 9.9999
> resultSimpsonRule
[1] 10.00247

> plot(seq(from=-10,to=(-10+(length(d)*0.2)-0.2),by=0.2),d) # stepsize 0.2 on x-axis
> resultTrapziumRule <- discreteIntegrationTrapeziumRule(d,ceiling(length(d)/2),length(d),0.2) # integrate over second part of vector, x-stepsize=0.2
> resultSimpsonRule <- discreteIntegrationSimpsonRule(d,ceiling(length(d)/2),length(d),0.2) # integrate over second part of vector, x-stepsize=0.2
> resultTrapziumRule
[1] 1.15478
> resultSimpsonRule
[1] 1.11678

In general, the Simpson rule offers better approximations of the integral. The more y-values you have (and the smaller the x-axis stepsize), the better your approximations will become.

Small EDIT for clarity: In this particular case the stepsize should obviously be 0.1. The complete area under the density curve is then (approximately) equal to 1, as expected.

> d=hist(p,breaks=100,plot=FALSE)$density
> hist(p,breaks=100,plot=FALSE)$mids # stepsize = 0.1
 [1] -4.75 -4.65 -4.55 -4.45 -4.35 -4.25 -4.15 -4.05 -3.95 -3.85 -3.75 -3.65 -3.55 -3.45 -3.35 -3.25 -3.15 -3.05 -2.95 -2.85 -2.75 -2.65 -2.55
[24] -2.45 -2.35 -2.25 -2.15 -2.05 -1.95 -1.85 -1.75 -1.65 -1.55 -1.45 -1.35 -1.25 -1.15 -1.05 -0.95 -0.85 -0.75 -0.65 -0.55 -0.45 -0.35 -0.25
[47] -0.15 -0.05  0.05  0.15  0.25  0.35  0.45  0.55  0.65  0.75  0.85  0.95  1.05  1.15  1.25  1.35  1.45  1.55  1.65  1.75  1.85  1.95  2.05
[70]  2.15  2.25  2.35  2.45  2.55  2.65  2.75  2.85  2.95  3.05  3.15  3.25  3.35  3.45  3.55  3.65  3.75  3.85  3.95  4.05  4.15

> resultTrapeziumRule <- discreteIntegrationTrapeziumRule(d,stepsize=0.1) 
> resultTrapeziumRule
[1] 0.999985