How to redefine cov to calculate population covariance matrix

The standard cov function calculates the sample covariance matrix, but I want the population covariance matrix.

I tried the following:

cov.pop <- function(x,y=NULL) {
  cov(x,y)*(length(x)-1)/length(x)
}

> sapply(list(Apple,HP,Microsoft),cov.pop,y=Apple) #correct
[1] 0.7861672 0.1363396 0.2223303
> sapply(list(Apple,HP,Microsoft),cov.pop,y=HP) #correct
[1] 0.13633964 0.09560376 0.05226032
> sapply(list(Apple,HP,Microsoft),cov.pop,y=Microsoft) #correct
[1] 0.22233028 0.05226032 0.13519964
> cov.pop(cbind(Apple,HP,Microsoft)) #not correct
              Apple         HP  Microsoft
Apple     0.8444018 0.14643887 0.23879919
HP        0.1464389 0.10268552 0.05613145
Microsoft 0.2387992 0.05613145 0.14521443

My question
Is there a simple way to modify the cov.pop function to get the correct population covariance matrix?

Answer by akrun (best answer)

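The results differ because length() does not mean the same thing for a matrix as for a vector. For each list element (a vector), length(x) is the number of observations. For the matrix cbind(Apple, HP, Microsoft), length(x) counts every cell, so it is three times the number of observations and the correction factor (length(x)-1)/length(x) is wrong. NROW(x) returns the number of observations in both cases: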

cov.pop <- function(x, y = NULL) {
  # NROW() gives the number of observations for vectors and matrices alike
  n <- NROW(x)
  cov(x, y) * (n - 1) / n
}
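
For reference, here is where the factor (n - 1)/n comes from, with n standing for the number of observations (the symbol n and the subscripts below are notation chosen for this note, not taken from the question):

```latex
\begin{align*}
\operatorname{cov}_{\text{sample}}(x, y)
  &= \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
\operatorname{cov}_{\text{pop}}(x, y)
  &= \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
   = \frac{n - 1}{n} \, \operatorname{cov}_{\text{sample}}(x, y)
\end{align*}
```

Since cov() returns the sample version, multiplying by (n - 1)/n is enough, as long as n really is the number of observations.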

Using an example dataset:

set.seed(24)
Apple <- rnorm(140)
HP <- rnorm(140)
Microsoft <- rnorm(140)

cov.pop(cbind(Apple,HP,Microsoft)) 
#                Apple          HP  Microsoft
#Apple     0.946489639 0.006511604 0.02518080
#HP        0.006511604 1.015532869 0.04940075
#Microsoft 0.025180805 0.049400745 1.08388185

sapply(list(Apple,HP,Microsoft),cov.pop,y=Apple)
#[1] 0.946489639 0.006511604 0.025180805

sapply(list(Apple,HP,Microsoft),cov.pop,y=HP)
#[1] 0.006511604 1.015532869 0.049400745

sapply(list(Apple,HP,Microsoft),cov.pop,y=Microsoft)
#[1] 0.02518080 0.04940075 1.08388185
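
As a quick sanity check, the sketch below shows how length() and NROW() differ on a matrix, and compares the rescaled cov() result with a population covariance computed directly by dividing by n. The toy matrix m and the variable names are made up for illustration and are not the data from the question.

```r
# Hypothetical toy data: 5 observations of 3 variables
m <- cbind(a = c(1, 2, 3, 4, 5),
           b = c(2, 4, 1, 3, 5),
           c = c(5, 3, 4, 1, 2))

length(m)       # 15: every cell of the matrix is counted
NROW(m)         # 5: number of observations (rows)
length(m[, 1])  # 5: for a plain vector, length() and NROW() agree

# Population covariance computed directly: centre the columns, divide by n
n <- NROW(m)
centered <- sweep(m, 2, colMeans(m))
pop_direct <- crossprod(centered) / n

# Population covariance via rescaling the sample covariance
pop_scaled <- cov(m) * (n - 1) / n

# TRUE when the two agree up to floating-point tolerance
all.equal(pop_direct, pop_scaled, check.attributes = FALSE)
```

With length(m) in place of NROW(m), the factor would be 14/15 instead of 4/5, which is the same kind of discrepancy seen in the matrix result in the question.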