# How to calculate a ratio of two dataframes with unevenly spaced values in R?

I have two datasets, each containing a value measured at several time points. I would like to visualize the ratio between the two datasets as a line chart. However, the time points of the measurements only partly overlap, and some experiments contain more values than others.

Data

``````r
df_1 <- as.data.frame(cbind(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), c(1, 4, 7, 8, 9.5, 17, 41, 27, 26, 10)))
df_2 <- as.data.frame(cbind(c(0.1, 0.5, 2, 3, 4.4, 5.3, 6, 7, 8, 10, 12, 15, 16, 20), c(0.1, 2, 4, 6, 7.5, 6.3, 9, 19, 22, 23, 24, 51, 31, 23)))
``````

Plots

``````r
library(tidyverse)

ggplot() +
  geom_line(data = df_1, aes(x = V1, y = V2), col = "black") +
  geom_line(data = df_2, aes(x = V1, y = V2), col = "red")
``````

Next, I would like to plot the ratio between the two curves. However, since the measured data-points are not evenly spaced (and measured at slightly different times), I am not able to simply divide the value for each time point and plot the result.
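
To make the mismatch concrete, here is a quick check of which time points the two data frames share (a minimal sketch in base R; `shared` is just an illustrative name):

```r
df_1 <- as.data.frame(cbind(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), c(1, 4, 7, 8, 9.5, 17, 41, 27, 26, 10)))
df_2 <- as.data.frame(cbind(c(0.1, 0.5, 2, 3, 4.4, 5.3, 6, 7, 8, 10, 12, 15, 16, 20), c(0.1, 2, 4, 6, 7.5, 6.3, 9, 19, 22, 23, 24, 51, 31, 23)))

# Time points (column V1) that occur in both data frames
shared <- intersect(df_1$V1, df_2$V1)
shared
```

Only six of the ten time points in `df_1` also appear in `df_2`, so a row-by-row division would either drop or misalign the remaining measurements.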

Question

How do you calculate the ratio between two unevenly spaced dataframes (which can be of different lengths) in R?

Here are two smaller example datasets. The third dataset (`df_3`) is the expected output, shown as the green line: the ratio `df_2`/`df_1`.

``````r
df_1 <- as.data.frame(cbind(time = c(1:5), value = c(1:5)))
df_2 <- as.data.frame(cbind(time = c(1.5, 2, 3, 4), value = c(2, 2, 2.5, 3.5)))
df_3 <- as.data.frame(cbind(time = c(1.5, 2, 3, 4), value = c(1.33, 1, 0.83, 0.875)))

ggplot() +
  geom_point(data = df_1, aes(x = time, y = value), col = "black", size = 3) +
  geom_point(data = df_2, aes(x = time, y = value), col = "red", size = 3) +
  geom_line(data = df_1, aes(x = time, y = value), col = "black") +
  geom_line(data = df_2, aes(x = time, y = value), col = "red") +
  geom_point(data = df_3, aes(x = time, y = value), col = "green", size = 3) +
  geom_line(data = df_3, aes(x = time, y = value), col = "green")
``````

Answer

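You can do a full join using the `dplyr` package. A full join keeps every time point from both data frames and leaves `NA` where one of them has no measurement; interpolation then fills those gaps. Note that the data frames I am using are the same as yours, but the columns are named `time` and `value`.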

Linear interpolation on the ratio

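``````r
df_1 <- as.data.frame(cbind(time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), value = c(1, 4, 7, 8, 9.5, 17, 41, 27, 26, 10)))
df_2 <- as.data.frame(cbind(time = c(0.1, 0.5, 2, 3, 4.4, 5.3, 6, 7, 8, 10, 12, 15, 16, 20), value = c(0.1, 2, 4, 6, 7.5, 6.3, 9, 19, 22, 23, 24, 51, 31, 23)))

library(dplyr)

# Keep every time point from both data frames; missing measurements become NA
df_1 %>%
  full_join(df_2, by = "time", suffix = c("_1", "_2")) %>%
  arrange(time) %>%
  # The ratio is only defined where both values were measured.
  # Note: this is value_1 / value_2; swap the operands for df_2 / df_1.
  mutate(ratio = value_1 / value_2,
         # Fill the gaps by linear interpolation; rule = 2 holds the nearest
         # known ratio constant outside the range of known ratios
         ratio = approx(time, ratio, xout = time, rule = 2)$y)

   time value_1 value_2     ratio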
1   0.1      NA     0.1 1.0000000
2   0.5      NA     2.0 1.0000000
3   1.0     1.0      NA 1.0000000
4   2.0     4.0     4.0 1.0000000
5   3.0     7.0     6.0 1.1666667
6   4.0     8.0      NA 1.4074074
7   4.4      NA     7.5 1.5037037
8   5.0     9.5      NA 1.6481481
9   5.3      NA     6.3 1.7203704
10  6.0    17.0     9.0 1.8888889
11  7.0    41.0    19.0 2.1578947
12  8.0    27.0    22.0 1.2272727
13  9.0    26.0      NA 0.8310277
14 10.0    10.0    23.0 0.4347826
15 12.0      NA    24.0 0.4347826
16 15.0      NA    51.0 0.4347826
17 16.0      NA    31.0 0.4347826
18 20.0      NA    23.0 0.4347826
``````
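
The constant ratios at the start (1.0000000) and at the end (0.4347826) of the output above come from the `rule` argument of `approx()`. A minimal sketch on made-up numbers (the vectors `x`, `y` and `xout` are illustrative only, not part of the data) shows the difference between the two settings:

```r
x    <- c(1, 2)
y    <- c(10, 20)
xout <- c(0, 1.5, 3)   # one point before, one inside, one after the known range

# rule = 1: points outside the range of x are returned as NA
inside_only <- approx(x, y, xout = xout, rule = 1)$y

# rule = 2: points outside the range of x take the value at the nearest end
held_const <- approx(x, y, xout = xout, rule = 2)$y

inside_only
held_const
```

With `rule = 1` the ratio would be `NA` before the first and after the last time point at which both values were measured, which may be the more honest choice if you do not want to extrapolate.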

Linear interpolation of value_1 on df_2

``````r
# Interpolated dataframe 1
# Interpolate df_1 at the time points of df_2, so that every time point in df_2
# within the measured range of df_1 gets a value_1 (rule = 1 returns NA outside it).
# Next, the ratio value_2 / value_1 is calculated.

int_df_1 <- as.data.frame(approx(df_1$time, df_1$value, xout = df_2$time, rule = 1))
names(int_df_1) <- c("time", "value")

# Again a full join, then division of df_2 by the interpolated df_1

int_df_1 %>%
  full_join(df_2, by = "time", suffix = c("_1", "_2")) %>%
  arrange(time) %>%
  mutate(ratio = value_2 / value_1)
``````
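
As a sanity check, the same idea can be applied to the small example from the question. This is a sketch in base R only (no join needed, because the interpolation is evaluated exactly at the time points of `df_2`); `value_1` and `ratio_check` are illustrative names:

```r
df_1 <- data.frame(time = 1:5, value = 1:5)
df_2 <- data.frame(time = c(1.5, 2, 3, 4), value = c(2, 2, 2.5, 3.5))

# Interpolate df_1 at the time points of df_2 (NA outside the range of df_1)
value_1 <- approx(df_1$time, df_1$value, xout = df_2$time, rule = 1)$y

# Ratio df_2 / df_1 at the time points of df_2
ratio_check <- data.frame(time = df_2$time, ratio = df_2$value / value_1)
ratio_check
```

The resulting ratios agree with the expected `df_3` from the question after rounding (1.33, 1, 0.83, 0.875).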