I am unsure of the validity of a point pattern analysis I am attempting using the inhomogeneous L-cross function with simulation envelopes to test for spatial association between two types of points. A plot of the simulation envelope vs. observed data values seems odd (extremely large simulated values), and it suggests inhibition rather than clustering (I would expect clustering looking at a plot of the point pattern).

I have a point pattern containing trees and seedlings in a 150 sq meter (5 x 10 m) plot. Coordinates for the four plot corners are contained in the 4th and 5th columns of the data.

Data:

Species     UTM.E       UTM.N       Plot.UTM.E  Plot.UTM.N
tree        4002027.599 5501253.964 4002024.175 5501253.558
tree        4002027.599 5501254.66  4002033.956 5501251.478
tree        4002028.536 5501254.592 4002027.293 5501268.23
tree        4002032.155 5501252.43  4002037.075 5501266.151
tree        4002033.586 5501253.409     
tree        4002033.692 5501253.512     
tree        4002033.1   5501253.958     
tree        4002032.485 5501264.136     
tree        4002032.144 5501264.748     
tree        4002030.003 5501264.156     
tree        4002030.241 5501266.473     
tree        4002029.094 5501267.435     
tree        4002028.704 5501265.775     
seedling    4002030.41  5501252.891     
seedling    4002030.412 5501252.9       
seedling    4002030.83  5501252.977     
seedling    4002029.896 5501252.863     
seedling    4002029.745 5501253.161     
seedling    4002028.376 5501252.949     
seedling    4002028.681 5501252.579     
seedling    4002028.374 5501252.339     
seedling    4002028.09  5501254.159     
seedling    4002026.928 5501255.562     
seedling    4002026.557 5501255.224     
seedling    4002026.815 5501255.986     
seedling    4002025.22  5501255.444     
seedling    4002024.608 5501254.13      
seedling    4002025.102 5501254.298     
seedling    4002025.482 5501254.06      
seedling    4002025.081 5501254.004     
seedling    4002025.1   5501253.905     
seedling    4002024.644 5501253.774     
seedling    4002026.475 5501256.743     
seedling    4002026.158 5501256.234     
seedling    4002028.481 5501258.382     
seedling    4002028.995 5501257.457     
seedling    4002029.313 5501257.7       
seedling    4002029.4   5501256.325     
seedling    4002029.378 5501255.91      
seedling    4002029.518 5501256.314     
seedling    4002028.519 5501256.774     
seedling    4002028.495 5501256.468     
seedling    4002030.388 5501256.809     
seedling    4002030.701 5501256.626     
seedling    4002029.037 5501260.088     
seedling    4002027.834 5501262.373     
seedling    4002028.002 5501262.844     
seedling    4002028.299 5501262.517     
seedling    4002028.186 5501262.239     
seedling    4002028.735 5501262.656     
seedling    4002028.93  5501262.677     
seedling    4002029.239 5501263.083     
seedling    4002029.744 5501263.277     
seedling    4002029.095 5501263.152     
seedling    4002028.777 5501265.856     
seedling    4002030.527 5501266.125     
seedling    4002031.215 5501266.118     
seedling    4002031.316 5501264.917     
seedling    4002031.027 5501262.104     
seedling    4002032.464 5501263.263     
seedling    4002032.824 5501262.688     
seedling    4002032.394 5501254.205     
seedling    4002032.394 5501254.192     
seedling    4002033.091 5501253.509     
seedling    4002031.179 5501254.413     
seedling    4002031.094 5501253.614     
seedling    4002031.084 5501253.45      
seedling    4002030.944 5501253.069     

I am interested in testing for spatial association among trees and seedlings using the intertype L-function (Lcross). Before testing, I checked for homogeneity of the point process using quadrat counts:

library(spatstat)

#setwd and read file
setwd()
file <- read.csv("tree seeds example.csv",header=TRUE)

#create marked point process with window bounded by plot corners
#window
x <- file$Plot.UTM.E[1:4]
y <- file$Plot.UTM.N[1:4]
w <- owin(poly=list(x=c(x[4],x[3],x[1],x[2]),y=c(y[4],y[3],y[1],y[2])))

#create point process with coordinates for each point and marks for trees vs. seedlings
points <- ppp(file$UTM.E,file$UTM.N,w,marks=file$Species)

#get window edges
e <- edges(w)

#rotate window to 90 degrees (thanks E. Rubak)
a<- angles.psp(e)
points.rotate <- rotate(points, -a[1])

#examine point pattern
plot(points.rotate)

#do quadrat count test and report p-value
M <- quadrat.test(points.rotate,nx=3,ny=3)
p <- M$p.value
p # extremely small p-value rejects null hypothesis of homogeneity

Because I am finding this point pattern appears inhomogenous (both visually and through quadrat counting), I decide to use the inhomogeneous "Lcross" function with simulation envelopes to test for spatial association between trees and seedlings.

I will investigate only 1, 2, 3, and 4 meter lags because I have a small plot size. I run 999 simulations, visually inspect the resulting plot, and compute a p-value for a two sided test using methods from Baddeley et al. 2014.

        #set vector of lag distances to examine for spatial association
        r.vec <- c(0,1,2,3,4) #meters

        #inhomogeneous Lcross function because q test supports inhomogeneity
        inhom <- envelope(points,fun=Lcross.inhom,r=r.vec,funargs=list("tree","seedling"),
nsim=999,correction="isotropic",savefuns=TRUE)
        plot(inhom)

        #get p-val for 4 m lag, according to Baddeley et al. 2014 "On tests of 
        #spatial patterns based on simulation envelopes"
        #equation for two-sided test: "2*min(j+1,m+1-j)/(m+1)

        m <- 999 # number of sims
        obs <- inhom$obs[5] #observed value for lag 5
        sims <- attr(inhom,"simfuns") # get simulation values
        lag5sims <- sims[5,] #get simulation values only for lag 5
        lag5sims <- as.matrix(lag5sims) #change to matrix
        lag5sims <- lag5sims[,2:1000] #drop first r value
        j <- sum(lag5sims>obs)

        2*min(j+1,m+1-j)/(m+1) # get result of significant inhibition (because j is large)

The high value of the computed simulation envelope is extremely large, and the plot just doesn't look right to me. Furthermore, I find significant negative spatial association at the 4 meter lag distance using methods from Baddeley et al. 2014. BUT, looking at a plot of the point pattern, it seems like at 4 meters there could be positive spatial association between seedlings and trees, or at least not extreme negative association. When I run the same code using the homogeneous Lcross function, I actually find significant positive association at 4 meters.

Large simulation envelope using the inhomogeneous Lcross function

Visually, it seems like there should be positive association at higher lag distances

Is use of the inhomogeneous Lcross function inappropriate here, or am I using it incorrectly?

Thanks very much for taking the time to read a long question and for any help.

2

There are 2 answers

0
Adrian Baddeley On BEST ANSWER

The problem with the example code is that, if X is a point pattern, envelope(X, .....) performs a test of complete randomness.

To test for clustering/inhibition in the presence of spatial inhomogeneity, the null hypothesis should be an inhomogeneous Poisson process. You'll need to estimate the inhomogeneous intensity functions of the two types of points, somehow, and then generate simulated point patterns according to these intensities. Here are two ways (if X is your point pattern):

  1. Kernel smoothing: First estimate the intensity of each type of point in the original data by D <- density(split(X)). A simulated realisation from the null hypothesis can now be generated by Y <- rmpoispp(D, types=names(D)). We want this to happen in the envelope command, so just do

    envelope(X, Lcross.inhom, simulate=expression(rmpoispp(D, types=names(D)))).
    The argument simulate specifies that this expression should be evaluated to generate each simulated point pattern.

  2. Using a model: First fit a Poisson point process model to the observed data, e.g. fit <- ppm(X ~ polynom(x,y,3)). Then do

    envelope(fit, Lcross.inhom, lambdaX=fit).

    Since the first argument is a ppm object, this is handled by envelope.ppm. This will generate simulated realisations from the fitted Poisson model, and will compute inhomogeneous L-cross functions from each realisation. The argument lambdaX is passed to Kcross.inhom; see ?Kcross.inhom for an explanation of how this is interpreted.

For full details, see Chapter 10 of the spatstat book.

10
Ege Rubak On

Just a short answer since it is late in my time zone.

You are not using an inhomogeneous null model to simulate from. You are simply simulating two independent homogeneous Poisson processes (one for trees and one for seedlings) as per the documentation of envelope.ppp:

If Y is a point pattern (an object of class "ppp") and simulate=NULL, then we generate nsim simulations of Complete Spatial Randomness (i.e. nsim simulated point patterns each being a realisation of the uniform Poisson point process) with the same intensity as the pattern Y. (If Y is a multitype point pattern, then the simulated patterns are also given independent random marks; the probability distribution of the random marks is determined by the relative frequencies of marks in Y.)

You can save the simulated patterns in envelope.ppp by adding the argument savepatterns = TRUE and then you can extract them by

pat <- attr(inhom, "simpatterns")

If you plot (some of) these by plot(as.solist(pat[1:9]) you see they are homogeneous and look nothing like your original pattern, so they also result in very different Lcross.inhom functions. One way to fix this is to fit a model to the data using ppm and run envelope on the fitted model (dispatched to envelope.ppm). You have very few points in total and in particular very few trees, so the estimate of the intensity of trees is very uncertain and probably sensitive to the bandwidth used. You really need to be careful about what you do here...