How to slice and calculate the Pearson correlation coefficient between a large array and a small array using overlapping windows

Question

Asked by mad on 24 January 2023 at 12:42

Suppose I have two very simple arrays with numpy:

import numpy as np
reference=np.array([0,1,2,3,0,0,0,7,8,9,10])
probe=np.zeros(3)

I would like to find which slice of the array reference has the highest Pearson correlation coefficient with the array probe. To do that, I slice reference into overlapping sub-arrays in a for loop: I shift the window by one element of reference at a time and compare each window against probe. I did the slicing with the inelegant code below:

from statistics import correlation
for i in range(0,len(reference)):
  #get the slice of the data 
  sliced_data=reference[i:i+len(probe)]
  #only calculate the correlation when probe and reference have the same number of elements 
  if len(sliced_data)==len(probe):
      my_rho = correlation(sliced_data, probe)

I have one issue and one question about this code:

1- When I run the code, I get the error below:

my_rho = correlation(sliced_data, probe)
  File "/usr/lib/python3.10/statistics.py", line 919, in correlation
    raise StatisticsError('at least one of the inputs is constant')
statistics.StatisticsError: at least one of the inputs is constant

2- Is there a more elegant way of doing such slicing in Python?


**mozway** · Accepted Answer · 2023-01-24T13:10:18+00:00

You can use sliding_window_view to get the successive windows. For a vectorized computation of the correlation, use a custom function:

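import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv

def np_corr(X, y):
    # adapted from https://stackoverflow.com/a/71253141
    # X: 2D array of windows (one per row); y: 1D probe of the same width
    n = len(y)
    denom = np.sqrt((n * np.sum(X**2, axis=-1) - np.sum(X, axis=-1) ** 2)
                    * (n * np.sum(y**2) - np.sum(y) ** 2))
    num = n * np.sum(X * y[None, :], axis=-1) - np.sum(X, axis=-1) * np.sum(y)
    # out= gives a defined result (0) where denom == 0; without it,
    # np.divide leaves those entries uninitialized
    return np.divide(num, denom, out=np.zeros_like(denom), where=denom != 0)

corr = np_corr(swv(reference, len(probe)), probe)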

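Output (these values correspond to a non-constant probe such as np.array([1, 2, 3]); with the all-zero probe from the question every correlation is undefined):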

array([ 1.        ,  1.        , -0.65465367, -0.8660254 ,  0.        ,
        0.8660254 ,  0.91766294,  1.        ,  1.        ])
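To answer the original question of which slice correlates best, the correlation array can be combined with an argmax. The sketch below is one possible variant, not the answer's exact function: it uses a mean-centered formula, marks undefined correlations as NaN rather than 0, and assumes a hypothetical probe of [1, 2, 3]. The name windowed_pearson is illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy 1.20+

def windowed_pearson(reference, probe):
    """Pearson r between probe and every overlapping window of reference."""
    probe = np.asarray(probe, dtype=float)
    windows = sliding_window_view(np.asarray(reference, dtype=float), len(probe))
    # Center each window and the probe on their own means.
    wc = windows - windows.mean(axis=-1, keepdims=True)
    pc = probe - probe.mean()
    num = wc @ pc
    denom = np.sqrt((wc ** 2).sum(axis=-1) * (pc ** 2).sum())
    # NaN marks windows where the correlation is undefined (zero variance).
    out = np.full(num.shape, np.nan)
    np.divide(num, denom, out=out, where=denom != 0)
    return out

reference = np.array([0, 1, 2, 3, 0, 0, 0, 7, 8, 9, 10])
probe = np.array([1, 2, 3])  # hypothetical non-constant probe

corr = windowed_pearson(reference, probe)
# nanargmax skips NaN entries; it raises ValueError if every entry is NaN.
best = int(np.nanargmax(corr))
best_slice = reference[best:best + len(probe)]
print(best, best_slice, corr[best])
```

When several windows tie for the maximum, nanargmax returns the first one. Using NaN instead of 0 for undefined windows keeps them from being mistaken for genuinely uncorrelated ones.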
