How to use dissimilarity matrix with function metaMDS?


I have a matrix derived from a table with three original columns: column 1 = site codes, column 2 = species codes and column 3 = biomass weight for each species. The biomass weight of each species in each plot is displayed in the matrix. The matrix can be calculated with one of the three following options (thanks to feedback on an earlier question):

reshape::cast(dissimBiom, plot ~ species, value = 'biomass', fun = mean)
by(dissimBiom, dissimBiom$biomass, function(x) with(x, table(plot, species)))
tapply(dissimBiom$biomass,list(dissimBiom$plot,dissimBiom$species),mean)

Note: dissim is the .csv file name for the two-column table; dissimBiom is the .csv file name for the three-column table.

I would now like to generate a dissimilarity matrix based on the above matrix. The code below requires the packages vegan and ecodist.

I had earlier used the function

matrix <- with(dissim, table(plot,species))

to generate a matrix based on two columns (site vs species) only and then used

matrix.meta <- metaMDS(matrix, k=2, distance = "bray", trymax=10) 

to generate a dissimilarity matrix. This worked just fine.

In contrast, attempts to generate a dissimilarity matrix where the matrix has been generated with one of the following codes (as above)

reshape::cast(dissimBiom, plot ~ species, value = 'biomass', fun = mean)
by(dissimBiom, dissimBiom$biomass, function(x) with(x, table(plot, species)))
tapply(dissimBiom$biomass,list(dissimBiom$plot,dissimBiom$species),mean)

using the same function

matrixBiom.meta <- metaMDS(matrixBiom, k=2, distance = "bray", trymax=10)

results in the following error message

Error in if (any(autotransform, noshare > 0, wascores) && any(comm < 0)) { : 
  missing value where TRUE/FALSE needed

Note: I read matrixBiom from the file matrixBiom.csv, which I wrote in order to convert the NAs to 0, using

write.csv(matrixBiom, "matrixBiom.csv", na="0",row.names=TRUE)

In contrast to matrixBiom.meta, matrix.meta was directly used on 'matrix' without writing a .csv file.

Also, the matrix generated by

matrix <- with(dissim, table(plot,species))

looks like this,

           species
plot        xanfla1 xangria xanret
  a100f177r     1.4       0    8.9
  a100f562r       0     5.6      0
  a100f56r     22.4       0    1.3

while the matrix generated by either of the other approaches has the format

          zinunk ziz150 zizang
a100f177r   22.4     NA    2.6
a100f562r    1.3     NA     NA
a100f56r      NA    3.1     NA
a100f5r       NA     NA    0.2

My questions are:

1) In any of these functions

reshape::cast(dissimBiom, plot ~ species, value = 'biomass', fun = mean)
by(dissimBiom, dissimBiom$biomass, function(x) with(x, table(plot, species)))
tapply(dissimBiom$biomass,list(dissimBiom$plot,dissimBiom$species),mean)

can NAs be converted directly to 0s, to avoid writing and reading in a .csv file? Maybe this would solve the problem.

2) What fixes could be used for the three-column table example to conduct an NMDS using metaMDS?

3) Are there alternative functions to calculate a dissimilarity matrix for the three-column table example?

Any advice would be very much appreciated.

Please find a reproducible data subset below:

> dput(dataframe)
structure(list(plot = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L), .Label = c("a1f17r", 
"a1f56r", "a1m17r", "a1m5r"), class = "factor"), species = structure(c(12L, 
29L, 16L, 21L, 24L, 19L, 6L, 13L, 14L, 5L, 16L, 12L, 26L, 9L, 
29L, 28L, 17L, 15L, 25L, 6L, 3L, 8L, 27L, 6L, 1L, 7L, 18L, 10L, 
12L, 11L, 2L, 20L, 13L, 27L, 22L, 23L, 4L, 1L), .Label = c("annunk", 
"blurip", "cae089", "caepar", "chrodo", "clihir", "dalpin", "derele", 
"embphi", "ficmeg", "indunk", "jactom", "leeind", "merbor", "mergra", 
"mikcor", "nep127", "nepbis", "nepbis1", "palunk", "rubcle", 
"sinirp", "spagyr1", "sphoos", "stitrut", "tetped", "tinpet", 
"uncgla", "zinunk"), class = "factor"), biomass = c(100.6, 284.6, 
13.8, 2.8, 1, 3.1, 8.8, 0.5, 15.2, 13.8, 6.1, 5.3, 18.8, 4.1, 
199, 68, 143.3, 11.3, 6.5, 0.2, 54.1, 39, 22, 1.2, 6.3, 6, 0.1, 
2.8, 42, 1.9, 0.1, 0.2, 0.2, 0.1, 2.1, 4.3, 0.7, 0.2)), .Names = c("plot", 
"species", "biomass"), class = "data.frame", row.names = c(NA, 
-38L))

1 answer

Gavin Simpson (best answer)

Question 1:

Not easily, so do it in a second step. I find the tapply() result neater, so I'll go with that (assuming your example data is in dat):

dat2 <- as.data.frame(with(dat, tapply(biomass, list(plot, species), mean)))

giving

> dat2[, 1:6]
       annunk blurip cae089 caepar chrodo clihir
a1f17r     NA     NA     NA     NA     NA    0.2
a1f56r     NA     NA     NA     NA   13.8    8.8
a1m17r    0.2     NA     NA    0.7     NA     NA
a1m5r     6.3    0.1   54.1     NA     NA    1.2

Then to convert NA to 0 we do

dat2[is.na(dat2)] <- 0

which gives us

> dat2[, 1:6]
       annunk blurip cae089 caepar chrodo clihir
a1f17r    0.0    0.0    0.0    0.0    0.0    0.2
a1f56r    0.0    0.0    0.0    0.0   13.8    8.8
a1m17r    0.2    0.0    0.0    0.7    0.0    0.0
a1m5r     6.3    0.1   54.1    0.0    0.0    1.2
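
In R 3.4.0 and later, tapply() also accepts a default argument that sets the value used for empty cells, so the zeros can be filled in within the same call. A minimal sketch, assuming a small subset of the posted data stored in dat (the subset is for illustration only):

```r
# Small illustrative subset of the posted data (assumption: stored in `dat`)
dat <- data.frame(
  plot    = c("a1f56r", "a1f56r", "a1f17r", "a1f17r",
              "a1m5r", "a1m5r", "a1m5r", "a1m17r"),
  species = c("jactom", "zinunk", "jactom", "zinunk",
              "tinpet", "tinpet", "jactom", "annunk"),
  biomass = c(100.6, 284.6, 5.3, 199, 22, 0.1, 42, 0.2)
)

# `default = 0` (R >= 3.4.0) fills plot/species combinations that have no rows
dat2 <- as.data.frame(
  with(dat, tapply(biomass, list(plot, species), mean, default = 0))
)

# Check that no NA cells remain
any(is.na(dat2))
```

On older R versions the two-step approach above (tapply() followed by dat2[is.na(dat2)] <- 0) remains the way to go.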

Question 2:

Given the solution to Question 1, no further steps are required: pass dat2 to metaMDS() in place of matrixBiom.
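
To make this concrete: the condition quoted in the error message evaluates any(comm < 0), which returns NA when comm contains NA values and no negative values, and if() cannot branch on NA. A minimal sketch, assuming a small subset of the posted data in dat and that vegan is installed (the metaMDS() call is skipped otherwise). With only four plots the ordination is for illustration only:

```r
# Small illustrative subset of the posted data (assumption: stored in `dat`)
dat <- data.frame(
  plot    = c("a1f56r", "a1f56r", "a1f17r", "a1f17r",
              "a1m5r", "a1m5r", "a1m5r", "a1m17r"),
  species = c("jactom", "zinunk", "jactom", "zinunk",
              "tinpet", "tinpet", "jactom", "annunk"),
  biomass = c(100.6, 284.6, 5.3, 199, 22, 0.1, 42, 0.2)
)

# Plot-by-species matrix of mean biomass; empty cells are NA at this point
dat2 <- as.data.frame(with(dat, tapply(biomass, list(plot, species), mean)))

# With NAs present, the comparison from the error message is NA, not TRUE/FALSE
na_check <- any(dat2 < 0)

# Replace NA with 0, after which the same comparison is a proper logical
dat2[is.na(dat2)] <- 0
zero_check <- any(dat2 < 0)

# Run the NMDS only if vegan is available
if (requireNamespace("vegan", quietly = TRUE)) {
  ord <- vegan::metaMDS(dat2, k = 2, distance = "bray", trymax = 10)
}
```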

Question 3:

Follow the solution in Question 1 above and then run dist() or vegdist() or some other function that can compute dissimilarity matrices from data frame objects.
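
A minimal sketch of that last step, assuming a small subset of the posted data in dat. dist() is in base R and computes Euclidean distances by default; the vegdist() call assumes vegan is installed and is skipped otherwise:

```r
# Small illustrative subset of the posted data (assumption: stored in `dat`)
dat <- data.frame(
  plot    = c("a1f56r", "a1f56r", "a1f17r", "a1f17r",
              "a1m5r", "a1m5r", "a1m5r", "a1m17r"),
  species = c("jactom", "zinunk", "jactom", "zinunk",
              "tinpet", "tinpet", "jactom", "annunk"),
  biomass = c(100.6, 284.6, 5.3, 199, 22, 0.1, 42, 0.2)
)

# Plot-by-species matrix of mean biomass, with empty cells set to 0
dat2 <- as.data.frame(with(dat, tapply(biomass, list(plot, species), mean)))
dat2[is.na(dat2)] <- 0

# Base R: Euclidean distances between plots
d_euc <- dist(dat2)

# vegan: Bray-Curtis dissimilarities between plots
if (requireNamespace("vegan", quietly = TRUE)) {
  d_bray <- vegan::vegdist(dat2, method = "bray")
}
```

According to the vegan documentation, metaMDS() also accepts a dist object such as d_bray in place of the community data, if you prefer to compute the dissimilarities yourself.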