How to calculate the 3x3 covariance matrix for RGB values across an image dataset?

Question

2.3k views Asked by ProGamerGov At 07 January 2025 at 17:59

I need to calculate the covariance matrix for RGB values across an image dataset, and then apply Cholesky decomposition to the final result.

The covariance matrix for RGB values is a 3x3 matrix M, where M_(i, i) is the variance of channel i and M_(i, j) is the covariance between channels i and j.

The end result should be something like this:

([[0.26, 0.09, 0.02],
[0.27, 0.00, -0.05],
[0.27, -0.09, 0.03]])

I'd prefer to stick to PyTorch functions, even though NumPy has a cov function.

I attempted to recreate NumPy's cov function in PyTorch, based on other cov implementations and clones:

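import torch

def pytorch_cov(tensor, tensor2=None, rowvar=True):
    # Optionally stack a second set of variables, similar to the y argument of numpy.cov
    if tensor2 is not None:
        tensor = torch.cat((tensor, tensor2), dim=0)
    # Treat a 1-D input as a single variable
    tensor = tensor.view(1, -1) if tensor.dim() < 2 else tensor
    # rowvar=False: columns are variables, so transpose
    tensor = tensor.t() if not rowvar and tensor.size(0) != 1 else tensor
    # Center each variable, then form the unbiased estimate
    tensor = tensor - torch.mean(tensor, dim=1, keepdim=True)
    return 1 / (tensor.size(1) - 1) * tensor.mm(tensor.t())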

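def cov_vec(x):
    # x: (batch, variables, observations); the einsum sums over the last dim,
    # so the mean and the sample count must use that dim as well
    n = x.size(2)
    m1 = x - torch.mean(x, dim=2, keepdim=True)
    out = torch.einsum('ijk,ilk->ijl', m1, m1) / (n - 1)
    return out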

The dataset loading would be like this:

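import torch
import torchvision

# ToTensor converts each PIL image to a float tensor of shape (3, H, W) in [0, 1]
to_tensor = torchvision.transforms.ToTensor()
dataset = torchvision.datasets.ImageFolder(data_path, transform=to_tensor)
loader = torch.utils.data.DataLoader(dataset)

for images, _ in loader:
    batch_size = images.size(0)
    ...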

For the moment I'm just experimenting with images created with torch.randn(batch_size, 3, height, width).
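
As a minimal sketch of how such a random batch could be reduced to a single 3x3 matrix: the channel dimension is moved to the front and every pixel of every image is treated as one observation. The helper name batch_rgb_cov is hypothetical and only used for illustration.

```python
import torch

def batch_rgb_cov(images):
    # images: (B, 3, H, W) -> (3, B*H*W): one row per channel, one column per pixel
    pixels = images.permute(1, 0, 2, 3).reshape(3, -1)
    # Center each channel, then form the unbiased estimate
    centered = pixels - pixels.mean(dim=1, keepdim=True)
    return centered @ centered.t() / (pixels.size(1) - 1)

torch.manual_seed(0)
images = torch.randn(4, 3, 32, 32)  # stand-in for a real batch
cov = batch_rgb_cov(images)
print(cov.shape)  # torch.Size([3, 3])
```

For torch.randn input the channels are independent with unit variance, so the result should be close to the identity matrix; real images would give strongly correlated channels instead.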

Edit:

I'm attempting to replicate the matrix used in TensorFlow's Lucid, which is partly explained on distill.pub.

Second Edit:

To make the output resemble the example matrix above, compute an SVD-based square root instead of using Cholesky:

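# Average the covariance accumulated over the dataset
rgb_cov_tensor = rgb_cov_tensor / len(loader.dataset)
U, S, V = torch.svd(rgb_cov_tensor)
epsilon = 1e-10  # keeps the square root well-defined if a singular value is 0
svd_sqrt = U @ torch.diag(torch.sqrt(S + epsilon))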

The resulting matrix can then be used to perform color decorrelation, which is useful for visualizing features (DeepDream). I've implemented it in my project.
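
A rough sketch of what applying such a matrix could look like, under stated assumptions: the covariance values below are made up for illustration, decorrelated_to_rgb is a hypothetical helper name, and torch.linalg.svd (the current counterpart of torch.svd, returning Vh rather than V) is used. Each pixel of a tensor in the decorrelated space is multiplied by svd_sqrt to obtain correlated RGB values.

```python
import torch

torch.manual_seed(0)

# Made-up symmetric positive definite covariance, for illustration only
rgb_cov_tensor = torch.tensor([[0.08, 0.07, 0.06],
                               [0.07, 0.08, 0.07],
                               [0.06, 0.07, 0.09]], dtype=torch.float64)

# SVD-based square root: svd_sqrt @ svd_sqrt.T reproduces the covariance
U, S, Vh = torch.linalg.svd(rgb_cov_tensor)
svd_sqrt = U @ torch.diag(torch.sqrt(S + 1e-10))

def decorrelated_to_rgb(x, color_matrix):
    # x: (B, 3, H, W) in the decorrelated space; every pixel vector p becomes
    # color_matrix @ p, so the output has the same shape as the input
    return torch.einsum('ij,bjhw->bihw', color_matrix, x)

noise = torch.randn(1, 3, 64, 64, dtype=torch.float64)
rgb = decorrelated_to_rgb(noise, svd_sqrt)
print(rgb.shape)  # torch.Size([1, 3, 64, 64])
```

Because the covariance of A z is A A^T for unit-variance, uncorrelated z, white noise pushed through svd_sqrt ends up with approximately the dataset's RGB covariance.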

There is 1 answer

**Gil Pinsky** · Accepted Answer · 2020-09-22T18:44:59+00:00

Here is a function, rgb_cov, that computes the unbiased sample covariance matrix of a 3-channel image. The Cholesky decomposition then follows directly with torch.linalg.cholesky (torch.cholesky in older PyTorch releases):

import torch

def rgb_cov(im):
    '''
    Assuming im is a torch.Tensor of shape (H, W, 3).
    '''
    im_re = im.reshape(-1, 3)
    # Center out of place so the caller's tensor is not modified
    im_re = im_re - im_re.mean(0, keepdim=True)
    return 1 / (im_re.shape[0] - 1) * im_re.T @ im_re

# Test:
im = torch.randn(50, 50, 3)
cov = rgb_cov(im)
L_cholesky = torch.linalg.cholesky(cov)
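
The function above handles a single image. For a whole dataset, one possible approach (a sketch, not part of the original answer) is to accumulate the pixel sums and the sums of outer products batch by batch, so that all pixels never have to be held in memory at once. The name dataset_rgb_cov is hypothetical, and batches are assumed to have shape (B, 3, H, W) as in the question.

```python
import torch

def dataset_rgb_cov(batches):
    """Unbiased 3x3 RGB covariance over an iterable of (B, 3, H, W) tensors."""
    n = 0
    total = torch.zeros(3, dtype=torch.float64)     # running sum of pixel vectors
    outer = torch.zeros(3, 3, dtype=torch.float64)  # running sum of x x^T
    for images in batches:
        # (B, 3, H, W) -> (3, B*H*W); float64 limits round-off in the sums
        pixels = images.permute(1, 0, 2, 3).reshape(3, -1).double()
        n += pixels.size(1)
        total += pixels.sum(dim=1)
        outer += pixels @ pixels.t()
    mean = total / n
    # sum (x - mean)(x - mean)^T == sum x x^T - n * mean mean^T
    return (outer - n * torch.outer(mean, mean)) / (n - 1)

torch.manual_seed(0)
batches = [torch.randn(8, 3, 16, 16) for _ in range(5)]  # stand-in for a DataLoader
cov = dataset_rgb_cov(batches)
L_factor = torch.linalg.cholesky(cov)  # lower triangular, L_factor @ L_factor.T == cov
```

With a loader that yields (images, labels) pairs, a generator such as (images for images, _ in loader) could be passed instead of the list.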
