Is there a way to sum specific rows of a column?

I have a dataset (please see the attached sample picture and dataset file) in which I wish to sum the numeric column 'tdiff' based on specific criteria, e.g. rows (1 + 2) and rows (3 + 4), but not rows (11, 12, 13, 14). I have tried the following, but with no luck:

xx<- chaPe [rowSums(1:2, 3:4, 11, 12, 13, 14, 15:16),]
xx<- sum(chaPe $tdiff [c(1:2, 3:4, 11, 12, 13, 14, 15:16)],)

Basically, if you look at the column 'xsampa', only the 'tdiff' values in the rows where 'xsampa' is 'p' or 'A' need to be summed.

The expected result for, e.g., rows (1 + 2) is (0.068 + 0.011) = 0.079. Also, how does the sum affect the values in the other columns, presuming they hold the same values within each pair of rows, except for the column 'rn' (which is not really important)?

I am new to R, so any help would be great, as I cannot figure this out. Thanks.

There are 2 answers

Ronak Shah (BEST ANSWER)

You can start a new group whenever 'p' occurs, so that the first 2 rows form one group, the next 2 rows another group, and rows 11:14 stay as they are. For each group we can then sum the tdiff values (stored as sum_tdiff). For the other columns you can decide which values you want to keep. For example, below I keep the first value in each group for the columns Filename and Place.

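library(dplyr)

# Start a new group at every 'p' row, then sum tdiff within each group
result <- chaPe %>%
  group_by(grp = cumsum(xsampa == 'p')) %>%
  summarise(sum_tdiff = sum(tdiff),
            Filename = first(Filename),  # keep the first value per group
            Place = first(Place))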
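
To see what the grouping key looks like, here is a minimal sketch on a made-up data frame. The name `toy` and every value in it are assumptions for illustration, apart from 0.068 and 0.011, which come from the question; it is not the real chaPe data.

```r
library(dplyr)

# Hypothetical stand-in for chaPe (illustrative values only)
toy <- data.frame(
  Filename = rep("AK_chape.TextGrid", 4),
  xsampa   = c("p", "A", "p", "A"),
  tdiff    = c(0.068, 0.011, 0.070, 0.012)
)

# The key increases by one at every 'p' row: 1 1 2 2
grp_key <- cumsum(toy$xsampa == "p")

# Same pipeline as above, applied to the toy data
toy_result <- toy %>%
  group_by(grp = cumsum(xsampa == "p")) %>%
  summarise(sum_tdiff = sum(tdiff),
            Filename = first(Filename))
```

Here the first group sums to 0.068 + 0.011 = 0.079, matching the expected result in the question. Note that any row following a 'p' stays in that 'p' group until the next 'p' appears, so it is worth checking the key against your own data before summarising.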

Agaz Wani

Another way is to group the data by Filename; an example is below.

library(dplyr)
result <- chaPe %>%
  group_by(Filename) %>%
  summarise(sum = sum(tdiff))

  Filename             sum
  <chr>              <dbl>
1 AK_chape.TextGrid 0.0800
2 DS_chape.TextGrid 0.0844
3 MS_chape.TextGrid 0.0834
4 NS_chape.TextGrid 0.0884
5 PS_chape.TextGrid 0.0838
6 RS_chape.TextGrid 0.0877