Interpreting Anomaly detection R values


I have an assignment in which I need to detect anomalies in a dataset. I'm using the 'anomalize' package in R and was wondering how to interpret the following output values of the 'anomalize' function:

Remainder_L1 Remainder_L2

I've checked the documentation but I'm unable to find the calculation method for these values. Can someone explain this calculation?

Anomalize output


Answer by stevec (best answer)

The anomalize documentation gives a great example of how to apply anomalize() to a time series.

The example generates the remainder_l1 and remainder_l2 values for CRAN tidyverse downloads. That data comes with the anomalize package, so there is no need to import anything; just run the code below to see how it generates the columns:


# install.packages("anomalize")

library(tidyverse)
library(tibbletime)
library(anomalize)


tidyverse_cran_downloads %>%
    time_decompose(count, merge = TRUE) %>%
    anomalize(remainder) 

 #   package date       count observed season trend remainder remainder_l1 remainder_l2 anomaly
 #   <chr>   <date>     <dbl>    <dbl>  <dbl> <dbl>     <dbl>        <dbl>        <dbl> <chr>  
 # 1 broom   2017-01-01  1053    1053. -1007. 1708.    352.         -1725.        1704. No     
 # 2 broom   2017-01-02  1481    1481    340. 1731.   -589.         -1725.        1704. No     
 # 3 broom   2017-01-03  1851    1851    563. 1753.   -465.         -1725.        1704. No     
 # 4 broom   2017-01-04  1947    1947    526. 1775.   -354.         -1725.        1704. No     
 # 5 broom   2017-01-05  1927    1927    430. 1798.   -301.         -1725.        1704. No  

What do these values mean? From the anomalize source code we see:

"remainder_l1" (lower limit for anomalies), "remainder_l2" (upper limit for anomalies)

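In the example above, the limits are compared with the remainder column, not the raw count (1053). In the first row the remainder (352) lies between remainder_l1 (-1725) and remainder_l2 (1704), so the row is not flagged. anomalize() would treat it as an anomaly only if the remainder were less than -1725 or greater than 1704.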
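
As for how the limits are calculated: as far as I can tell from the anomalize documentation, the default method = "iqr" with alpha = 0.05 takes the 25% and 75% quantiles of the remainder and widens them by an IQR factor of 0.15 / alpha (3x by default). Below is a minimal base-R sketch of that rule. iqr_limits() is a hypothetical helper written for illustration (it is not the package's function), and the remainder vector is made-up data. In the output above the limits are constant within a package, which is consistent with one pair of limits per group, so do not expect this sketch to reproduce -1725 and 1704 exactly.

```r
# Hypothetical helper, NOT part of anomalize: a sketch of the IQR rule
# described in the package documentation (assumed default alpha = 0.05).
iqr_limits <- function(x, alpha = 0.05) {
  # 25% and 75% quantiles of the remainder
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  # IQR factor: 0.15 / alpha, i.e. 3x the IQR when alpha = 0.05
  iqr_factor <- 0.15 / alpha
  spread <- (q[2] - q[1]) * iqr_factor
  c(lower = q[1] - spread, upper = q[2] + spread)
}

# Made-up remainder values with one obvious outlier
remainder <- c(-10, -5, -2, 0, 1, 3, 4, 8, 12, 150)

limits <- iqr_limits(remainder)

# Flag a point when its remainder falls outside the limits,
# mirroring the role of remainder_l1 and remainder_l2
anomaly <- ifelse(
  remainder < limits[["lower"]] | remainder > limits[["upper"]],
  "Yes", "No"
)

print(limits)
print(data.frame(remainder, anomaly))
```

A smaller alpha widens the limits (fewer points flagged), and a larger alpha narrows them.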