I am currently working on a project in which I need to know from which date it is most convenient/pertinent to monitor birds. To do so, I would like to draw a cumulative curve with the cumulative number of species (y-axis) for each date (x-axis). I initially used the specaccum function from the vegan package, treating each date as if it were one site. While this works pretty well to give an overview of the number of repetitions (or visits) I would need in the field, it does not tell me around which date it is most pertinent to go out (the temporal aspect matters for bird occurrence).
I have also simply taken the number of species per date and visualised it with ggplot, which gave me a pretty decent overview, but my supervisor would like a cumulative curve.
Here are a few lines of my data frame and what I have done so far:
The data is in French (Espèce = Species, Nombre = Number, i.e. the count).
PDM<- data.frame(
Espèce= c("Corneille noire", "Alouette des champs", "Pipit farlouse", "Faisan de colchide", "Faisan de colchide", "Faisan de colchide",
"Pipit farlouse", "Pipit farlouse", "Alouette des champs", "Corneille noire", "Mésange charbonnière", "Merle noir",
"Étourneau sansonnet", "Pipit farlouse", "Pipit farlouse", "Alouette des champs", "Pipit farlouse", "Accenteur mouchet",
"Linotte mélodieuse", "Corneille noire", "Corbeau freux", "Alouette des champs", "Pinson des arbres", "Pipit farlouse",
"Merle noir", "Accenteur mouchet", "Mésange bleue", "Pigeon ramier", "Pigeon colombin", "Mésange charbonnière",
"Faisan de colchide", "Mouette rieuse", "Vanneau huppé", "Corneille noire", "Corneille noire", "Pigeon ramier",
"Pipit farlouse"),
Nombre= c(2, 5, 3, 1, 2, 1, 6, 6, 2, 3, 1, 1, 6, 6, 8, 1, 1, 1, 4, 2, 1, 7, 8, 3, 2, 6, 1, 1, 1, 4, 2, 1, 2, 3, 1, 4, 7, 2, 3, 1, 4, 7, 6, 5),
Date = c("04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022",
"04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022", "04/01/2022",
"21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022",
"21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022",
"21/01/2022", "21/01/2022", "21/01/2022", "21/01/2022"))
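library(dplyr); library(tidyr)      # data wrangling
library(vegan)                      # specaccum
library(ggplot2); library(ggrepel)  # plotting
PDM <- PDM %>%
  dplyr::select(Espèce, Date, Nombre) %>%
  group_by(Date, Espèce) %>%
  summarize(n = sum(Nombre))  # total count per species and date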
PDM$Espèce <- as.factor(PDM$Espèce)
PDM <- PDM[!(PDM$Espèce %in% c("Lièvre variable", "Blaireau d'Europe","Lièvre d'Europe","Chat domestique","Chevreuil","Hermine","Lapin")),] # I removed mammals species, as I only study birds
PDM$Espèce <- droplevels(PDM$Espèce)
PDM <- PDM[order(as.Date(PDM$Date,format = "%d/%m/%Y")),]
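# One row per date, one column per species (0 where a species was not seen)
PDM.w <- PDM %>% pivot_wider(names_from = "Espèce", values_from = "n", values_fill = 0)
# Drop the Date column so that only the species matrix is passed to specaccum
PDM.w <- as.data.frame(PDM.w[, 2:ncol(PDM.w)])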
PDM_courbe_2_ALL <- specaccum(PDM.w)
PDM_courbe_2_ALL
plot(PDM_courbe_2_ALL, col = "blue", ci.type = "poly", ci.col = "lightblue", ci.lty = 0, ylab = "Number of species", xlab = "Number of visits", main = "Accumulation curves Site1", font.sub = 4)
And here is the ggplot:
PDM_sp <- PDM %>%  # stored as PDM_sp, the name used in the ggplot call below
  group_by(Date) %>%
  summarise(n_sp = length(Espèce))
ggplot(PDM_sp) + aes(x = Date, y = n_sp) +
  geom_point() + geom_smooth(fill = "lightblue") + theme_classic() +
  ylab("Number of species") +
  geom_label_repel(aes(label = as.character(Date)),
                   box.padding = 0.35,
                   point.padding = 0.7,
                   segment.color = 'black') +
  labs(title = "Number of species through time", subtitle = "Site1")
And here is what I am looking for:
[Figure: red line representing what I would like]
It seems pretty simple to do, but for some reason I am having a hard time figuring out how to count the number of new species for each date (in order to accumulate those numbers afterwards). I would really appreciate your feedback.
There are many ways, but since you used vegan::specaccum, you can also use it to add sampling units in the order in which they appear in the data (here, by date) with the argument method = "collector".
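For intuition about what such a date-ordered curve represents, the number of new species per date can also be worked out by hand in base R. This is only a minimal sketch on made-up toy data: the data frame obs, its Species column and the helper columns is_new, new_species and cum_species are hypothetical names, not part of the original data or of vegan.

```r
# Hypothetical toy data (not the poster's records): one row per species per visit
obs <- data.frame(
  Date    = c("04/01/2022", "04/01/2022", "21/01/2022", "21/01/2022", "02/02/2022"),
  Species = c("Merle noir", "Pipit farlouse", "Merle noir", "Pigeon ramier", "Merle noir")
)

# Parse the dates and sort chronologically, so that the first occurrence
# of each species is also its earliest record
obs$Date <- as.Date(obs$Date, format = "%d/%m/%Y")
obs <- obs[order(obs$Date), ]

# A species is "new" only on the first row where it appears
obs$is_new <- !duplicated(obs$Species)

# Number of new species per date (dates without new species get 0);
# ISO-formatted date strings sort in chronological order
new_per_date <- tapply(obs$is_new, format(obs$Date), sum)

acc <- data.frame(
  Date        = as.Date(names(new_per_date)),
  new_species = as.vector(new_per_date)
)

# Cumulative number of species seen up to and including each date
acc$cum_species <- cumsum(acc$new_species)
print(acc)
```

The resulting acc data frame has one row per visit date, so it could be drawn as a step curve, for example with plot(acc$Date, acc$cum_species, type = "s"). Sorting by date before calling duplicated() is what makes the "first seen" flag chronological.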
This should work (untested, don't even have your data):