Is there a statistical method that can quantify similarity between two curves on different axes (with propagation of error)?


Is there an operation that would optimally overlay and stretch these curves (given that they are on different axes) and then provide a metric quantifying the "similarity" between them? A simple x-y correlation of the raw values loses the time-series covariation. Perhaps something like a minimum Euclidean distance?

Also, because the raw data are fairly noisy, they are fitted with GAM functions (the curves you see below) rather than using the point-to-point trends. However, these fits come with a level of uncertainty (the 95% confidence intervals, indicated by the area fill). Is there a way of propagating this uncertainty into the "similarity" metric? Perhaps a similarity metric with a +/- value that takes the variation of the GAM functions into account?
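To make the "stretch, then measure distance" idea concrete, here is a minimal sketch of one possible metric, offered as an illustration rather than a validated method. It assumes both curves have already been evaluated on the same depth grid. Each curve is z-scored, which removes the offset and scale of its own axis, and the Pearson correlation and root-mean-square (Euclidean) distance between the standardized curves are returned. The function name `curve_similarity` and the synthetic curves are hypothetical placeholders, not the real data.

```python
import numpy as np


def curve_similarity(curve_a, curve_b):
    """Compare two curves sampled on the same grid, ignoring their units.

    Each curve is z-scored (mean 0, standard deviation 1), which removes
    the offset and scale of its own axis. Returns the Pearson correlation
    and the root-mean-square distance between the standardized curves.
    """
    a = np.asarray(curve_a, dtype=float)
    b = np.asarray(curve_b, dtype=float)
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    r = float(np.mean(za * zb))
    rms_distance = float(np.sqrt(np.mean((za - zb) ** 2)))
    return r, rms_distance


# Synthetic stand-ins for two fitted curves on a shared depth grid
# (assumed values for illustration only).
depth = np.linspace(260, 590, 100)
curve_1 = 6.5 + 0.8 * np.sin(depth / 40.0)
curve_2 = 60.0 + 12.0 * np.sin(depth / 40.0) + 2.0 * np.cos(depth / 15.0)

r, d = curve_similarity(curve_1, curve_2)
print(f"Pearson r = {r:.3f}, RMS distance = {d:.3f}")
```

For z-scored curves the two numbers carry the same information, since the RMS distance equals sqrt(2 * (1 - r)). In other words, a minimum Euclidean distance after an optimal linear stretch is equivalent to the correlation between the fitted curves. A negative r would indicate that the curves mirror each other, which matters if one of the plot axes is drawn reversed.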

[Figure: the curves that need to be compared statistically for their degree of similarity.]

The Python code used to make the plots is below. These are geology/geochemistry plots in which depth is treated like a time series, read from bottom (oldest) to top (youngest). Apologies for the length of the code; I am not an expert coder.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from pygam import LinearGAM, s

# Read data (numeric conversion of Depth and the proxies is handled below)
data = pd.read_csv("Geochemical data.csv", encoding="latin1")

# Columns to convert to numeric
columns_to_convert = ["proxy_1", "Depth", "proxy_2"]

# Convert columns to numeric, replacing non-numeric values with NaN
data[columns_to_convert] = data[columns_to_convert].apply(pd.to_numeric, errors='coerce')

# Drop rows with NaN values in the specified columns
data = data.dropna(subset=columns_to_convert)

# Filter data by drillcore and remove NaN values
data_filtered = data[(data['Drillcore_ID'] == 'drillcore example') & (~data['proxy_1'].isna()) & (~data['proxy_2'].isna())]

# Set the style before creating the figure so that it applies to the new axes
plt.style.use('ggplot')

# Set the figure and subplot
fig, ax1 = plt.subplots(figsize=(4, 12))

# Fit GAM models
model_proxy_1 = LinearGAM(s(0, n_splines=20, lam=0.6)).fit(data_filtered['Depth'], data_filtered['proxy_1'])
model_proxy_2 = LinearGAM(s(0, n_splines=20, lam=0.6)).fit(data_filtered['Depth'], data_filtered['proxy_2'])

# Generate Depth values for plotting the models
depth_values = np.linspace(min(data_filtered['Depth']), max(data_filtered['Depth']), num=100)

# Predict values using the GAM models
predictions_proxy_1 = model_proxy_1.predict(depth_values)
predictions_proxy_2 = model_proxy_2.predict(depth_values)

# Add secondary x-axis for proxy_2
ax2 = ax1.twiny()
ax2.plot(data_filtered['proxy_2'], data_filtered['Depth'], 'o', markersize=3, color="#FF0000")
ax2.plot(predictions_proxy_2, depth_values, '--', linewidth=0.75, color="#FF0000")
ax2.set_xlabel(r'Proxy 2', color="#FF0000")
ax2.tick_params(axis='x', labelcolor="#FF0000")
ax2.set_xlim(25, 75)

# Plot results for proxy_1
ax1.plot(data_filtered['proxy_1'], data_filtered['Depth'], 'o', markersize=3, color="#30638E")
ax1.plot(predictions_proxy_1, depth_values, '-', linewidth=0.75, color="#30638E")
ax1.set_ylabel("Depth (m)")
ax1.set_xlabel(r'Proxy 1', color="#30638E")
ax1.tick_params(axis='x', labelcolor="#30638E")
ax1.set_xlim(8.5, 4.5)

# Generate confidence intervals for the GAM models
confidence_intervals_proxy_2 = model_proxy_2.confidence_intervals(depth_values, width=0.95)
confidence_intervals_proxy_1 = model_proxy_1.confidence_intervals(depth_values, width=0.95)

# Plot confidence intervals
ax2.fill_betweenx(depth_values, confidence_intervals_proxy_2[:, 0], confidence_intervals_proxy_2[:, 1], alpha=0.2, color="#FF0000")
ax1.fill_betweenx(depth_values, confidence_intervals_proxy_1[:, 0], confidence_intervals_proxy_1[:, 1], alpha=0.2, color="#30638E")

# Customize the plot (x-axis labels, tick colours and x-limits were set per axis above)
for ax in [ax1, ax2]:
    # Depth increases downward, so the y-axis runs from 590 m up to 260 m
    ax.set_ylim(590, 260)

    ax.set_facecolor('white')
    ax.spines['bottom'].set_color('black')
    ax.spines['left'].set_color('black')
    ax.spines['top'].set_color('black')
    ax.spines['right'].set_color('black')

    ax.xaxis.grid(color='#EAEAEA')
    ax.yaxis.grid(color='#EAEAEA')
    ax.grid(True, which='minor', color='#EAEAEA', linestyle='--')

plt.show()

# Save figure (its size comes from figsize above; width/height are not savefig options)
fig.savefig("Geochemical example.png", dpi=300, bbox_inches='tight', format='png')

The geochemical data are given below.

Sample_ID   Drillcore_ID       Depth   proxy_1  proxy_2  proxy_3
Sample 1    drillcore example  271.95  NaN      NaN      0.866
Sample 2    drillcore example  275.76  NaN      NaN      0.786
Sample 3    drillcore example  279.35  NaN      NaN      NaN
Sample 4    drillcore example  289.85  NaN      NaN      0.394
Sample 5    drillcore example  295.3   NaN      NaN      0.745
Sample 6    drillcore example  313.43  5.5      59.5     1.429
Sample 7    drillcore example  330.4   NaN      NaN      NaN
Sample 8    drillcore example  338.8   6        73.3     0.926
Sample 9    drillcore example  341.1   5.8      50       1.208
Sample 10   drillcore example  365.2   6.6      72.4     1.844
Sample 11   drillcore example  371.76  7.5      71.3     0.799
Sample 12   drillcore example  376.4   6.8      56.7     1.354
Sample 13   drillcore example  382.22  NaN      NaN      1.261
Sample 14   drillcore example  393.7   10.9     23.5     0.223
Sample 15   drillcore example  402.4   6.5      59.4     0.303
Sample 16   drillcore example  406.7   6.2      74.2     0.401
Sample 17   drillcore example  414.75  6.5      79.4     0.408
Sample 18   drillcore example  423.1   NaN      NaN      0.497
Sample 19   drillcore example  429.7   6.7      81.5     0.445
Sample 20   drillcore example  443.7   7.1      62.3     0.657
Sample 21   drillcore example  450     7.1      75.8     0.758
Sample 22   drillcore example  452.1   6.2      56.1     0.783
Sample 23   drillcore example  457.2   6.6      68.9     0.631
Sample 24   drillcore example  461.83  7.8      65.7     0.802
Sample 25   drillcore example  465.3   5.9      47.6     0.74
Sample 26   drillcore example  470.7   6.3      69.8     0.878
Sample 27   drillcore example  474.4   6.6      60.6     0.927
Sample 28   drillcore example  477     7.1      64.8     0.826
Sample 29   drillcore example  478.2   5.9      53.4     0.751
Sample 30   drillcore example  485     6.5      71.2     0.971
Sample 31   drillcore example  489     6.4      72.1     0.948
Sample 32   drillcore example  492.98  NaN      NaN      1.049
Sample 33   drillcore example  494.5   5.6      49.1     0.925
Sample 34   drillcore example  494.6   6.5      66       0.86
Sample 35   drillcore example  498.88  5.6      63.9     1.075
Sample 36   drillcore example  506     5.1      70.5     1.201
Sample 37   drillcore example  515.35  5.4      68.7     1.087
Sample 38   drillcore example  522.68  5.4      85.5     0.995
Sample 39   drillcore example  527.32  5.5      63       0.892
Sample 40   drillcore example  542.61  6.2      69.4     0.801
Sample 41   drillcore example  552.62  6.2      37.3     0.625
Sample 42   drillcore example  558.9   8        41.4     1.369
Sample 43   drillcore example  561.5   6        33.8     0.465
Sample 44   drillcore example  577     6.6      56.7     4.55
Sample 45   drillcore example  578.5   7.4      56.5     2.707
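
For the uncertainty part of the question (a similarity value with a +/-), one possible approach is a bootstrap: resample the (depth, proxy_1, proxy_2) rows with replacement, refit both smooths, recompute the similarity, and summarize the spread of the results. The sketch below is an illustration under stated assumptions, not a tested procedure. To stay self-contained it uses a cubic `np.polyfit` as a stand-in for the GAM fits and synthetic data in place of the table above; `smooth` and `bootstrap_similarity` are hypothetical helper names.

```python
import numpy as np


def smooth(depth, values, grid, degree=3):
    """Stand-in smoother (assumption): polynomial fit evaluated on a common grid."""
    # Centre and scale depth so the polynomial fit is well conditioned.
    centre, scale = grid.mean(), grid.std()
    coeffs = np.polyfit((depth - centre) / scale, values, degree)
    return np.polyval(coeffs, (grid - centre) / scale)


def bootstrap_similarity(depth, y1, y2, n_boot=500, seed=0):
    """Bootstrap the correlation between two smoothed curves.

    Returns the median correlation and a 95% percentile interval.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(depth.min(), depth.max(), 100)
    n = len(depth)
    r_values = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        fit_1 = smooth(depth[idx], y1[idx], grid)
        fit_2 = smooth(depth[idx], y2[idx], grid)
        r_values[i] = np.corrcoef(fit_1, fit_2)[0, 1]
    low, high = np.percentile(r_values, [2.5, 97.5])
    return float(np.median(r_values)), float(low), float(high)


# Synthetic data sharing one underlying signal (assumed, for illustration only).
data_rng = np.random.default_rng(42)
depth = np.sort(data_rng.uniform(300, 580, size=40))
signal = np.sin(depth / 45.0)
proxy_1 = 6.5 + 0.7 * signal + data_rng.normal(0, 0.3, size=40)
proxy_2 = 60.0 + 10.0 * signal + data_rng.normal(0, 5.0, size=40)

median, low, high = bootstrap_similarity(depth, proxy_1, proxy_2)
print(f"r = {median:.2f} (95% interval {low:.2f} to {high:.2f})")
```

With the real data, `smooth` could be swapped for the same `LinearGAM` fit and `predict` calls used in the plotting code. Note that a plain row bootstrap treats samples as independent, which may understate the uncertainty if neighbouring depths are autocorrelated.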