I would like to winsorize the column "Return". However, I want to winsorize per Year. My data looks like this:
structure(list(Name = c("A", "A", "A", "A", "A", "A", "B", "B",
"B", "B", "B", "B", "C", "C", "C", "C", "C", "C"), Date = c("01.09.2018",
"02.09.2018", "03.09.2018", "05.11.2021", "06.11.2021", "07.11.2021",
"01.09.2018", "02.09.2018", "03.09.2018", "05.11.2021", "06.11.2021",
"07.11.2021", "01.09.2018", "02.09.2018", "03.09.2018", "05.11.2021",
"06.11.2021", "07.11.2021"), Return = c(0.05, 0.1, 0.8, 1.5,
0.1, -2, 0.4, 0.6, 0.6, 0.2, -0.5, -0.9, -0.4, 1.7, 0.3, -4,
0.6, 0.5), Year = c("2018", "2018", "2018", "2021", "2021", "2021",
"2018", "2018", "2018", "2021", "2021", "2021", "2018", "2018",
"2018", "2021", "2021", "2021")), row.names = c(NA, -18L), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), groups = structure(list(Year = c("2018",
"2021"), .rows = structure(list(c(1L, 2L, 3L, 7L, 8L, 9L, 13L,
14L, 15L), c(4L, 5L, 6L, 10L, 11L, 12L, 16L, 17L, 18L)), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -2L), .drop = TRUE))
I tried the following code, which runs without errors:
library(dplyr)
library(DescTools)  # provides Winsorize()

Data <- Data %>%
  group_by(Year) %>%
  mutate("Winsorized Return" = Winsorize(`Return`, probs = c(0.01, 0.99)))
Then I ran the same thing, but without grouping the data by year:
Data <- Data %>%
  mutate("Winsorized Return" = Winsorize(`Return`, probs = c(0.01, 0.99)))
I get exactly the same results, even though I used group_by(Year) in the first snippet.
Can someone explain why the results are the same? I also get identical results when I do this with my real (very large) dataset.
Do I need different code to winsorize the column "Return" per year?
Thank you very much in advance for any help!
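
One thing that may be worth checking when reproducing this: the dput() output above already has class "grouped_df" with Year as the grouping column, and the result of the first snippet is assigned back to Data, so the second mutate() may still be running per year. The sketch below shows how to inspect and remove the grouping before comparing. It uses a small made-up tibble (`toy`) and a hypothetical base-R helper `winsorize_vec()` as a stand-in for Winsorize(), so that it runs with dplyr alone; both are assumptions for illustration, not part of the original code.

```r
library(dplyr)

# Hypothetical stand-in for Winsorize(): clamp values to the given quantiles.
winsorize_vec <- function(x, probs = c(0.01, 0.99)) {
  q <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

# Made-up example data (not the data from the question).
toy <- tibble(
  Year   = rep(c("2018", "2021"), each = 5),
  Return = c(0.1, 0.2, 0.3, 0.4, 5, -4, -0.2, 0, 0.2, 0.4)
)

# Grouped computation; the result keeps its grouping after mutate().
per_year <- toy %>%
  group_by(Year) %>%
  mutate(W = winsorize_vec(Return))

group_vars(per_year)  # lists the columns the tibble is still grouped by

# Same call without group_by(): the existing grouping is still in effect.
still_grouped <- per_year %>%
  mutate(W = winsorize_vec(Return))

# Dropping the grouping first gives quantiles over the whole column.
overall <- per_year %>%
  ungroup() %>%
  mutate(W = winsorize_vec(Return))

same_when_still_grouped <- identical(per_year$W, still_grouped$W)
same_after_ungroup      <- isTRUE(all.equal(per_year$W, overall$W))
```

If `group_vars(Data)` returns "Year" right before the second snippet, calling `ungroup()` first should make the grouped and ungrouped versions comparable.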