Rarefaction curve

Question

Asked by Aduragbemi on 24 June 2023 at 04:18 · 661 views

I would like to ask why my rarefaction curves, for both the rarefied and the non-rarefied data, don't look ideal. What could be causing this?

This is the script I used:

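library(phyloseq)  # otu_table(), sample_sums()
library(vegan)     # rarecurve()
library(dplyr)     # %>%, left_join()
library(ggplot2)   # plotting

sam.data_soil_rare <- data.frame(bac.unedited@sam_data)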
sam.data_soil_rare 
OTU.table_soil_rare <- otu_table(bac.unedited) %>%
  as.data.frame() %>%
  as.matrix()
OTU.table_soil_rare

raremax_soil <- min(rowSums(t(OTU.table_soil_rare)))
rare.fun_soil <- rarecurve(t(OTU.table_soil_rare), step = 200, sample = raremax_soil, tidy = TRUE)
rare.fun_soil
bac.rare.curve.extract2_soil <- left_join(sam.data_soil_rare, rare.fun_soil, by = c("Samples" = "Site"))
bac.rare.curve.extract2_soil

bac.rare_soil_plot <- ggplot(bac.rare.curve.extract2_soil, aes(x = Sample, y = Species, group = Samples,
color = Sample_or_Control)) + 
  #geom_point() +
  geom_line(size = 1) + 
  xlab("Reads") +
  ylab("Number of ASVs") +
  ggtitle("Soil") +
  theme_classic() + 
  theme(legend.position="none") +
  geom_vline(xintercept = median(sample_sums(soil_bac_sub_no_bad_Filtered)), linetype = "solid") +
  scale_color_manual(values = cbbPalette)
bac.rare_soil_plot

I tried a different package to check the rarefaction curve on the unfiltered phyloseq object, but got the same kind of non-ideal graph.

There is 1 answer

**Jari Oksanen** · Answer 1 · 2023-06-24T09:33:00+00:00

There is no reproducible example, and you do not explain how the curves are "non-ideal". They look pretty normal to me.

My guess is that you have multiplied your data by some arbitrary number, which makes them unsuitable for rarefaction. Perhaps you have also pre-processed the data and removed some of the rarest cases as "read errors", which makes the data even less suitable for rarefaction.

Rarefaction should be applied to observed counts. It works by subsampling your observed data, and as the subsample shrinks some species drop out, which reduces the number of species. Taxa that occur only once or twice are usually among the first to drop out, and this gives a smoothly changing rarefaction curve. If you have multiplied your data (as is common in your field), a taxon that you observed only once will be multiplied to an abundance of, say, 1000 (or whatever your multiplier is). Taxa with such large counts drop out much more slowly, so for a long stretch the number of species does not decrease and you get that horizontal line. Only when you have subsampled down to a very small fraction do you undo the multiplication; then all species start dropping out very rapidly and you get that very steep part of the curve.
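The drop-out argument can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: the six-taxon community, the multiplier of 1000, and the helper name p_retained are all made up here. It computes, for each taxon, the probability of still being present after half of the individuals are drawn without replacement.

```r
# Hypothetical toy community: counts for six taxa, two of them singletons
counts <- c(50, 30, 15, 3, 1, 1)

# Probability that a taxon with count k is still present after drawing
# n individuals without replacement from a total of N individuals
p_retained <- function(k, N, n) 1 - exp(lchoose(N - k, n) - lchoose(N, n))

frac <- 0.5        # keep half of the individuals (arbitrary choice)
N <- sum(counts)   # total number of individuals in the toy community

# Retention probabilities for the observed counts
obs <- p_retained(counts, N, round(frac * N))
# Same fraction kept after multiplying every count by 1000
mult <- p_retained(1000 * counts, 1000 * N, round(frac * 1000 * N))

round(data.frame(count = counts, observed = obs, multiplied = mult), 3)
```

With the observed counts a singleton survives the halving with probability 0.5, while after multiplication every taxon is retained almost surely, which corresponds to the long horizontal stretch of the curve.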

You can see the effect of multiplication by comparing these two curves using the BCI data from vegan:

library(vegan)
data(BCI)
rarecurve(BCI) # observed counts: correct
rarecurve(10 * BCI) # multiplied: wrong
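To put numbers on the same comparison, here is a minimal sketch using vegan's rarefy() on a single BCI plot. The choice of the first plot, the fractions, and the variable names are arbitrary assumptions for illustration. Both versions are subsampled to the same fraction of their total count.

```r
library(vegan)
data(BCI)

x <- unlist(BCI[1, ])              # observed counts for one plot (arbitrary choice)
frac <- c(0.1, 0.25, 0.5, 0.75)    # fractions of the total count to keep (arbitrary)

n_obs <- round(frac * sum(x))        # subsample sizes for the observed counts
n_mult <- round(frac * sum(10 * x))  # same fractions of the multiplied total

# Expected number of species at each subsample size
obs <- as.vector(rarefy(x, sample = n_obs))
# Recent vegan versions may warn here, because the smallest count is now 10
mult <- as.vector(rarefy(10 * x, sample = n_mult))

data.frame(fraction = frac, observed = obs, multiplied = mult)
sum(x > 0)                         # total number of species in this plot
```

At every fraction the multiplied data should keep more species than the observed counts and sit close to the total, which is the flat part of the curve described above.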

The current version of vegan issues a warning if you do not have any counts of 1. This may be a false positive, but it should alert you to check that you are using observed data instead of multiplied data or data with rare cases removed.
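A quick way to run that check yourself is sketched below. It uses the BCI data as a stand-in for your own table and assumes samples are in rows and taxa in columns; the variable names are illustrative only.

```r
library(vegan)
data(BCI)

counts <- as.matrix(BCI)  # stand-in: replace with your own samples-by-taxa counts

# Are all entries whole numbers? Non-integers suggest transformed data.
all_integer <- all(counts == round(counts))

# Smallest non-zero count in the whole table; raw counts usually include 1
smallest <- min(counts[counts > 0])

# Number of taxa seen exactly once in each sample (singletons)
singletons <- rowSums(counts == 1)

all_integer
smallest
summary(singletons)
```

If the smallest non-zero count is well above 1, or there are no singletons at all, the data have probably been multiplied or filtered. In the question's script the table has taxa in rows, so the transposed matrix t(OTU.table_soil_rare) would be the equivalent input.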
