For my project, I work with 3-dimensional MRI data, stacked into a 4-dimensional array whose fourth dimension represents the different subjects (I use the package nilearn for this). I am using sklearn.decomposition.PCA
to extract a given number of principal components from my data. Now I would like to plot the components separately on a brain image, that is, show a brain image with my extracted components (in this case, 2) in different colors.
Here is some example code using the OASIS dataset, which can be downloaded via the nilearn API. The steps are:
- masking using nilearn.input_data.NiftiMasker, which converts my 4-dimensional data into a 2-dimensional array (n_subjects x n_voxels)
- standardizing the data matrix using StandardScaler
- running the PCA using sklearn.decomposition.PCA:
## set workspace
import numpy as np
from nilearn.datasets import fetch_oasis_vbm
from nilearn.input_data import NiftiMasker
from nilearn.image import index_img
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from nilearn import plotting
## Load Data #################################################################
# take only first 30 subjects as example
oasis_dataset = fetch_oasis_vbm(n_subjects=30)
imgs = np.array(oasis_dataset['gray_matter_maps'])
## PIPELINE ###################################################################
# create a random number generator
rng = np.random.RandomState(42)
# Convert Images to 2D Data Array
niftimasker = NiftiMasker(mask_strategy='template')
# z-standardize images
scaler = StandardScaler()
# Extract 2 Components
pca = PCA(n_components=2,
          svd_solver='full',
          random_state=rng)
# create pipeline
pipe = Pipeline([('niftimasker', niftimasker),
                 ('scaler', scaler),
                 ('pca', pca)])
# call fit_transform on pipeline
X = pipe.fit_transform(imgs)
As far as I understand, what I obtain after running the PCA are the PCA loadings? Unfortunately, I don't understand how to get from this to two images, each containing one PCA component.
To get the data back into image format, you will need to use NiftiMasker.inverse_transform(), which requires that you preserve the dimensionality in voxel space.
As the pipeline works now, you are reducing dimensionality in voxel space. Just in case you wanted to reduce dimensionality in subject space instead, you would change the following:
Then you would apply an inverse transform as follows:
Then, to get each individual subject-component image, you will use index_img from nilearn.image. E.g. this is the image for the first subject component:
However, I think you are interested in reducing dimensionality in voxel space. Therefore, to preserve the voxel dimensions for the inverse transform, you will need the index of each voxel feature chosen in the PCA dimensionality reduction. Keep your pipeline the way you had it originally and do the following:
Then tile up nan arrays of shape n_subjects x n_voxels (in your case 30 x 229007):
Then apply the inverse transform to each component:
You will now have 2 images, each with 30 subjects and 1 valid voxel value representing the chosen component. It is up to you how to aggregate the component voxel over the 30 subjects; in this case I am just going to use the mean_img function from nilearn.image:
Finally, in both cases, plot the respective image. In the voxel-reduced version you will see a small variation between the two images in the X dimension (second diagram), but hardly any in Y and Z. I'm using plot_glass_brain from nilearn.plotting:
To use overlays, adjust color maps to make the plots easier to read, and explore other plotting options, consult this and the other nilearn plotting guides:
https://nilearn.github.io/plotting/index.html#different-display-modes
Let me know if you have any more questions.