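# Mean absolute deviation around the median of a numeric vector
calcola_mad <- function(dati) {
  mediana_calcolata <- quantile(dati, 0.5)
  deviazioni_assolute <- abs(dati - mediana_calcolata)
  mad_calcolata <- mean(deviazioni_assolute)
  return(mad_calcolata)
}

# Computed locally, on the in-memory gapminder data frame
mad <- calcola_mad(gapminder$lifeExp)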
meanLife_tbl <- gapminder_tbl %>%
  group_by(year, continent) %>%
  summarise(media = mean(lifeExp), mediana = percentile(lifeExp, 0.5), s = sd(lifeExp)) %>%
  mutate(md = mad) %>%
  filter(continent == 'Asia')
I'm trying to compute the MAD (the mean absolute deviation from the median, as in calcola_mad above) inside a pipeline, using sparklyr in R, and I shouldn't collect anything into memory. I'm using the gapminder dataset. How do I write this function so that it runs in Spark?