I have 3 large data frames that look like this:
library(tibble)
df1 <- tibble(peak=c("peak1","peak2","peak3"),
coord1=c(100,500,1000),
coord2=c(250,700,1250))
df2 <- tibble(peak=c("peak5","peak6","peak7"),
coord1=c(120,280,900),
coord2=c(300,400,1850))
df3 <- tibble(peak=c("peak8","peak9","peak10"),
coord1=c(900,3000,5600),
coord2=c(2000,3400,5850))
df1
#> # A tibble: 3 × 3
#> peak coord1 coord2
#> <chr> <dbl> <dbl>
#> 1 peak1 100 250
#> 2 peak2 500 700
#> 3 peak3 1000 1250
df2
#> # A tibble: 3 × 3
#> peak coord1 coord2
#> <chr> <dbl> <dbl>
#> 1 peak5 120 300
#> 2 peak6 280 400
#> 3 peak7 900 1850
df3
#> # A tibble: 3 × 3
#> peak coord1 coord2
#> <chr> <dbl> <dbl>
#> 1 peak8 900 2000
#> 2 peak9 3000 3400
#> 3 peak10 5600 5850
I am relatively new to R and I am trying to find the coordinate ranges (coord1, coord2) that are unique to each data frame, that overlap between two data frames, and that overlap across all three data frames.
I want these data frames as output. At the moment it's hard for me to work out how to tell R (dplyr) that I want to filter based on overlapping ranges. There must be a command that I am missing.
unique: the ranges of these peaks do not overlap with the ranges of peaks in the other data frames
> unique
peak coord1 coord2
peak6 280 400
peak9 3000 3400
peak10 5600 5850
common between df1-df2
> df1df2
peak coord1 coord2
peak1 100 250
peak5 120 300
peak3 1000 1250
peak7 900 1850
common between df1-df3
peak coord1 coord2
peak3 1000 1250
peak8 900 2000
and then common between df1-df2-df3
To be honest, I don't understand what the final goal of your search is. In any case, here is a solution that uses a tidyverse approach and functions from the ivs package to check the intervals. It is not an elegant solution, and it does not consider overlapping intervals within the same data frame.
Your data
Use of function iv to create intervals (iv_overlaps is used in the later steps to compare them)
Bind dataframes
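The interval-creation and binding steps could look roughly like the sketch below. It assumes the three data frames from the question; the column names `source` and `range` are illustrative choices for this sketch, not names taken from the original answer. Note that `iv()` builds right-open intervals, so two ranges that only touch at an endpoint are not treated as overlapping.

```r
library(dplyr)
library(tibble)
library(ivs)

# Data from the question
df1 <- tibble(peak = c("peak1", "peak2", "peak3"),
              coord1 = c(100, 500, 1000),
              coord2 = c(250, 700, 1250))
df2 <- tibble(peak = c("peak5", "peak6", "peak7"),
              coord1 = c(120, 280, 900),
              coord2 = c(300, 400, 1850))
df3 <- tibble(peak = c("peak8", "peak9", "peak10"),
              coord1 = c(900, 3000, 5600),
              coord2 = c(2000, 3400, 5850))

# Stack the data frames and record which one each row came from.
# `source` and `range` are names chosen for this sketch.
peaks <- bind_rows(list(df1 = df1, df2 = df2, df3 = df3), .id = "source") %>%
  mutate(range = iv(coord1, coord2))  # right-open interval [coord1, coord2)

peaks
```

Keeping everything in one long table with a `source` column means the later overlap checks only need to compare rows whose `source` differs.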
Check overlapping intervals between dataframes
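One possible way to do this check is with `iv_locate_overlaps()`, which returns the row positions of every overlapping pair. The sketch below rebuilds the combined table directly from the question's values so that it runs on its own; `peaks`, `cross`, `df1df2` and `df1df3` are hypothetical names used only here.

```r
library(dplyr)
library(tibble)
library(ivs)

# Combined table built from the question's data (one row per peak)
peaks <- tibble(
  source = rep(c("df1", "df2", "df3"), each = 3),
  peak   = c("peak1", "peak2", "peak3", "peak5", "peak6", "peak7",
             "peak8", "peak9", "peak10"),
  coord1 = c(100, 500, 1000, 120, 280, 900, 900, 3000, 5600),
  coord2 = c(250, 700, 1250, 300, 400, 1850, 2000, 3400, 5850)
) %>%
  mutate(range = iv(coord1, coord2))

# Row positions (needles, haystack) of every pair of overlapping intervals.
# Each interval also overlaps itself, so no row is left unmatched.
loc <- iv_locate_overlaps(peaks$range, peaks$range)

# Keep only the pairs whose two peaks come from different data frames
cross <- tibble(
  peak         = peaks$peak[loc$needles],
  source       = peaks$source[loc$needles],
  other_peak   = peaks$peak[loc$haystack],
  other_source = peaks$source[loc$haystack]
) %>%
  filter(source != other_source)

# Peaks involved in an overlap between df1 and df2
df1df2 <- cross %>%
  filter(source %in% c("df1", "df2"), other_source %in% c("df1", "df2")) %>%
  distinct(source, peak)

# Peaks involved in an overlap between df1 and df3
df1df3 <- cross %>%
  filter(source %in% c("df1", "df3"), other_source %in% c("df1", "df3")) %>%
  distinct(source, peak)
```

With the question's data, `df1df2` contains peak1, peak3, peak5 and peak7, and `df1df3` contains peak3 and peak8, which agrees with the expected output in the question.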
Check non-overlapping intervals between dataframes
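For the non-overlapping peaks, a minimal sketch is to flag each peak with `iv_overlaps()` against the peaks of all the other data frames and keep the rows where the flag is `FALSE`. As above, the combined table is rebuilt from the question's values, and the names `peaks`, `shared` and `unique_peaks` are assumptions made for this example.

```r
library(dplyr)
library(tibble)
library(ivs)

# Combined table built from the question's data (one row per peak)
peaks <- tibble(
  source = rep(c("df1", "df2", "df3"), each = 3),
  peak   = c("peak1", "peak2", "peak3", "peak5", "peak6", "peak7",
             "peak8", "peak9", "peak10"),
  coord1 = c(100, 500, 1000, 120, 280, 900, 900, 3000, 5600),
  coord2 = c(250, 700, 1250, 300, 400, 1850, 2000, 3400, 5850)
) %>%
  mutate(range = iv(coord1, coord2))

# For each data frame, test its peaks against the peaks of all the others.
# iv_overlaps() returns one TRUE/FALSE per interval in its first argument.
shared <- logical(nrow(peaks))
for (s in unique(peaks$source)) {
  own <- peaks$source == s
  shared[own] <- iv_overlaps(peaks$range[own], peaks$range[!own])
}

# Peaks that overlap nothing outside their own data frame
unique_peaks <- peaks %>%
  filter(!shared) %>%
  select(peak, coord1, coord2)

unique_peaks
```

With the question's data this keeps peak6, peak9 and peak10, as in the expected output. It also keeps peak2 (500 to 700), which does not overlap any range in df2 or df3 even though the question's `unique` table does not list it.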