Finding the Median for every 3 values in a column

Question

Finding the Median for every 3 values in a column

77 views Asked by Rory Holden At 10 January 2024 at 11:38

I would like a line of code that is capable of finding the median of every 3 column values in the Fin1 column. E.g First value would be 5.08 out of 1.54, 5.08 and 5.26. The second would be 5.27 out of 2.79, 5.27 and 8.12. The whole data set contains 36 rows so this operation would have to occur 12 times.

(See data frame below)

Block trial Trial Category   Fin1    Fin2    Fin3   Fin4    Correct FastFin

2      1     2     1      1 1.5424  0.00000 0.00000 0.0000       1  1.5424

7      1     7     3      1 5.2617  0.97171 2.41070 3.8407       1  5.2617

9      1     9     2      1 5.0827  0.00000 0.00000 1.1977       1  5.0827

16     2    16     1      1 5.2732  1.28220 0.00000 3.0692       1  5.2732

19     2    19     2      1 8.1251  6.98210 1.52210 0.0000       1  8.1251

24     2    24     3      1 2.7960  1.87000 0.52903 0.0000       1  2.7960

Original Q&A

There are 2 answers

**r2evans** · Answer 1 · 2024-01-10T13:15:19+00:00

The trick here is figuring out how to "group" every three consecutive rows. Once we have that, this then becomes a simple aggregate by group.

I'll use a grp variable to do that, calculating with cumsum((rownum - 1) %% 3 == 0)

dplyr

library(dplyr)
quux %>%
  group_by(grp = cumsum((row_number() - 1) %% 3 == 0)) %>%
  mutate(val = median(Fin1)) %>%
  ungroup()
# # A tibble: 6 × 12
#   Block trial Trial Category  Fin1  Fin2  Fin3  Fin4 Correct FastFin   grp   val
#   <int> <int> <int>    <int> <dbl> <dbl> <dbl> <dbl>   <int>   <dbl> <int> <dbl>
# 1     1     2     1        1  1.54 0     0      0          1    1.54     1  5.08
# 2     1     7     3        1  5.26 0.972 2.41   3.84       1    5.26     1  5.08
# 3     1     9     2        1  5.08 0     0      1.20       1    5.08     1  5.08
# 4     2    16     1        1  5.27 1.28  0      3.07       1    5.27     2  5.27
# 5     2    19     2        1  8.13 6.98  1.52   0          1    8.13     2  5.27
# 6     2    24     3        1  2.80 1.87  0.529  0          1    2.80     2  5.27

If you needed all Fin# columns, then

quux %>%
  group_by(grp = cumsum((row_number() - 1) %% 3 == 0)) %>%
  mutate(across(starts_with("Fin"), ~ median(.x), .names = "{.col}_median")) %>%
  ungroup()
# # A tibble: 6 × 15
#   Block trial Trial Category  Fin1  Fin2  Fin3  Fin4 Correct FastFin   grp Fin1_median Fin2_median Fin3_median Fin4_median
#   <int> <int> <int>    <int> <dbl> <dbl> <dbl> <dbl>   <int>   <dbl> <int>       <dbl>       <dbl>       <dbl>       <dbl>
# 1     1     2     1        1  1.54 0     0      0          1    1.54     1        5.08        0          0            1.20
# 2     1     7     3        1  5.26 0.972 2.41   3.84       1    5.26     1        5.08        0          0            1.20
# 3     1     9     2        1  5.08 0     0      1.20       1    5.08     1        5.08        0          0            1.20
# 4     2    16     1        1  5.27 1.28  0      3.07       1    5.27     2        5.27        1.87       0.529        0   
# 5     2    19     2        1  8.13 6.98  1.52   0          1    8.13     2        5.27        1.87       0.529        0   
# 6     2    24     3        1  2.80 1.87  0.529  0          1    2.80     2        5.27        1.87       0.529        0

base R

grp <- cumsum((0:(nrow(quux)-1) %% 3) == 0)
grp
# [1] 1 1 1 2 2 2
Fins <- startsWith(names(quux), "Fin")
newFins <- paste0(Fins, "_median")
Fins
# [1] "Fin1" "Fin2" "Fin3" "Fin4"
newFins
# [1] "Fin1_median" "Fin2_median" "Fin3_median" "Fin4_median"
quux[,newFins] <- lapply(setNames(quux[,Fins], paste0(Fins, "_median")), function(val) ave(val, grp, FUN = median))
quux
#    Block trial Trial Category   Fin1    Fin2    Fin3   Fin4 Correct FastFin Fin1_median Fin2_median Fin3_median Fin4_median
# 2      1     2     1        1 1.5424 0.00000 0.00000 0.0000       1  1.5424      5.0827        0.00     0.00000      1.1977
# 7      1     7     3        1 5.2617 0.97171 2.41070 3.8407       1  5.2617      5.0827        0.00     0.00000      1.1977
# 9      1     9     2        1 5.0827 0.00000 0.00000 1.1977       1  5.0827      5.0827        0.00     0.00000      1.1977
# 16     2    16     1        1 5.2732 1.28220 0.00000 3.0692       1  5.2732      5.2732        1.87     0.52903      0.0000
# 19     2    19     2        1 8.1251 6.98210 1.52210 0.0000       1  8.1251      5.2732        1.87     0.52903      0.0000
# 24     2    24     3        1 2.7960 1.87000 0.52903 0.0000       1  2.7960      5.2732        1.87     0.52903      0.0000

Data

quux <- structure(list(Block = c(1L, 1L, 1L, 2L, 2L, 2L), trial = c(2L, 7L, 9L, 16L, 19L, 24L), Trial = c(1L, 3L, 2L, 1L, 2L, 3L), Category = c(1L, 1L, 1L, 1L, 1L, 1L), Fin1 = c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796), Fin2 = c(0, 0.97171, 0, 1.2822, 6.9821, 1.87), Fin3 = c(0, 2.4107, 0, 0, 1.5221, 0.52903), Fin4 = c(0, 3.8407, 1.1977, 3.0692, 0, 0), Correct = c(1L, 1L, 1L, 1L, 1L, 1L), FastFin = c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796)), class = "data.frame", row.names = c("2", "7", "9",  "16", "19", "24"))

**jay.sf** · Answer 2 · 2024-01-10T13:58:12+00:00

jay.sf On 10 January 2024 at 13:58

Creating a matrix with 3 rows then using matrixStats::colMedians.

> matrixStats::colMedians(matrix(quux$Fin1, nrow=3))
[1] 5.0827 5.2732

Data borrowed from r2evans.

TechQA.

Finding the Median for every 3 values in a column

There are 2 answers

dplyr

base R

Related Questions in R

Related Questions in MEDIAN

Popular Questions

Trending Questions