How to calculate correlation between two series with high autocorrelation


I have two series with high autocorrelation, which reduces the effective degrees of freedom. With an ordinary t-test, the p-value easily reaches a high significance level. How can I adjust the correlation coefficient to account for the autocorrelation (the reduced degrees of freedom)? Or, alternatively, how can I estimate the significance (p-value) correctly?

For the latter, I know one solution, which is to calculate an effective sample size, nadj = n.*(1-r1.*r2)./(1+r1.*r2), where r1 and r2 are the lag-1 autocorrelations of the two series. For highly autocorrelated series this gives a very low effective sample size, which makes it almost impossible for the correlation to be significant. But from the plot, I think the two series are significantly correlated.

Is there a correlation coefficient that is adjusted for autocorrelated series, or another method for estimating the correlation between two highly autocorrelated series?

My two series have length 99, with r = 0.5282 and p = 3.19e-8, but they are autocorrelated. With the effective sample size, which is 2.3247, the adjusted p-value is 0.845. So I think the two series are correlated, but after accounting for the autocorrelation they are not significantly correlated.
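Written out, the adjustment described above looks like this. The t statistic is the same one used in the code further down, and the last line only inverts the effective-sample-size formula with the numbers quoted in the question (n = 99, nadj = 2.3247), so it is arithmetic on the stated values, not a new result.

```latex
n_{\mathrm{adj}} = n \, \frac{1 - r_1 r_2}{1 + r_1 r_2},
\qquad
t = r \sqrt{\frac{n_{\mathrm{adj}} - 2}{1 - r^2}},
\qquad
\mathrm{df} = n_{\mathrm{adj}} - 2

r_1 r_2 = \frac{n - n_{\mathrm{adj}}}{n + n_{\mathrm{adj}}}
        = \frac{99 - 2.3247}{99 + 2.3247}
        \approx 0.954
```

So the product of the two lag-1 autocorrelations is about 0.954, and the t-test is left with only n_adj - 2 = 0.3247 degrees of freedom, which is why the adjusted p-value becomes so large.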

The following MATLAB code uses example data of shorter length.

data1 =[-0.2433   -0.2509   -0.2539   -0.2522   -0.2462   -0.2358   -0.2214   -0.2032   -0.1817   -0.1571   -0.1297   -0.1000   -0.0682   -0.0346    0.0005    0.0368    0.0740    0.1120    0.1504    0.1891];
data2 =[ -1.0213   -1.0088   -0.9919   -0.9707   -0.9452   -0.9155   -0.8816   -0.8437   -0.8021   -0.7570   -0.7088   -0.6577   -0.6041   -0.5487   -0.4917   -0.4339   -0.3757   -0.3176   -0.2604   -0.2046];
[rho,pval] = corr(data1', data2');  % rho = 0.9933, pval = 2.5766e-18

n = length(data1);
r1 = autocorr(data1,1);
r1 = r1(2);
r2 = autocorr(data2,1);
r2 = r2(2);
nadj = n.*(1-r1.*r2)./(1+r1.*r2);  % effective sample size, nadj = 3.0023

% t-test on rho using nadj - 2 degrees of freedom instead of n - 2
t_stat = rho * sqrt((nadj - 2) / (1 - rho^2));
padj = 2 * (1 - tcdf(abs(t_stat), nadj - 2));   % adjusted p-value, padj = 0.0734
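
For reuse, the steps above can be collected into one function. This is only a sketch of the same effective-sample-size approach, not a different or better method. The names corr_neff and lag1 are hypothetical, the lag-1 autocorrelation is assumed to be the usual biased sample estimate (mean removed, divided by the lag-0 sum of squares), and the NaN guard for nadj <= 2 is an addition, since the t-test has no positive degrees of freedom in that case. tcdf requires the Statistics and Machine Learning Toolbox.

```matlab
% corr_neff.m (hypothetical helper, not from the original post)
function [rho, p_adj, n_adj] = corr_neff(x, y)
    % Force column vectors so row or column input both work
    x = x(:);
    y = y(:);
    n = numel(x);

    % Ordinary Pearson correlation
    R = corrcoef(x, y);
    rho = R(1, 2);

    % Lag-1 autocorrelation of each series
    r1 = lag1(x);
    r2 = lag1(y);

    % Effective sample size, same formula as in the question
    n_adj = n * (1 - r1 * r2) / (1 + r1 * r2);

    % Assumption: with n_adj <= 2 there are no positive degrees of
    % freedom left, so report NaN instead of a p-value
    if n_adj <= 2
        p_adj = NaN;
        return
    end

    % Two-sided t-test with n_adj - 2 degrees of freedom
    t_stat = rho * sqrt((n_adj - 2) / (1 - rho^2));
    p_adj = 2 * tcdf(-abs(t_stat), n_adj - 2);
end

function r = lag1(v)
    % Biased sample autocorrelation at lag 1 (assumed estimator)
    d = v - mean(v);
    r = sum(d(1:end-1) .* d(2:end)) / sum(d .^ 2);
end
```

With the example data it would be called as [rho, padj, nadj] = corr_neff(data1, data2). Using 2 * tcdf(-abs(t_stat), ...) is the same two-sided p-value as 2 * (1 - tcdf(abs(t_stat), ...)), but it avoids subtracting from 1 when the p-value is very small.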
