How to reshape the result of sklearn's IncrementalPCA (ipca.transform)


I have a huge NumPy array that I need to transform using sklearn's IncrementalPCA. The array has shape (22, 258186260), so I split it along both axes, as in the code below.

First) I'm not sure whether it is allowed to split the input along both the first and second axes, or whether it should be split along the first axis only.

Second) If it is correct, how should I shape the output from "ipca.transform"?
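
On the first point, here is a minimal sketch on toy data (the array sizes, `n_components=3`, and the variable names are illustrative assumptions, not taken from the question). scikit-learn treats rows as samples and columns as features, and `partial_fit` expects every batch to have the same number of columns. Batching along axis 0 keeps the feature count fixed, while a batch with a different number of columns is rejected. Splitting along axis 1 into equal-width pieces does not raise an error, but each piece is then treated as a set of new samples rather than as part of the same rows.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Toy stand-in for the real data: 200 samples (rows), 12 features (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))

ipca = IncrementalPCA(n_components=3)

# Batch along axis 0 only: every batch keeps all 12 feature columns.
for batch in np.array_split(X, 4, axis=0):
    ipca.partial_fit(batch)

# One principal axis per component, each with one weight per feature.
components_shape = ipca.components_.shape

# A batch with a different number of columns no longer matches the fitted model.
narrow_batch = X[:, :6]
try:
    ipca.partial_fit(narrow_batch)
    feature_mismatch = False
except ValueError:
    feature_mismatch = True

print(components_shape, feature_mismatch)
```

If the long axis actually holds the observations, the data would have to be presented with observations as rows (for example by transposing each block as it is read). Whether that applies here depends on what the two axes mean.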

    import numpy as np

    # h5f is an already-open HDF5 file (presumably h5py) and
    # pca1 is an IncrementalPCA instance, both created earlier.
    dset = h5f['test']

    ds0, ds1 = dset.shape

    n = dset.shape[0]  # how many rows we have in the dataset
    chunk_size = 2  # (was 1000) how many rows we feed to IPCA at a time, a divisor of n

    # Fitting pass: each block of rows is split into 10 column pieces
    for i in range(0, n // chunk_size):
        for batch in np.split(dset[i*chunk_size : (i+1)*chunk_size], 10, axis=1):
            pca1.partial_fit(batch)

    # Transform pass: same splitting as above
    my_out_all = []
    for i in range(0, n // chunk_size):
        for batch in np.split(dset[i*chunk_size : (i+1)*chunk_size], 10, axis=1):
            my_out = pca1.transform(batch)
            my_out_all.append(my_out)  # how to shape the output
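
On shaping the output, here is a minimal sketch, assuming the batches are taken along axis 0 only (the toy array, `chunk_size=20`, and `n_components=3` are illustrative assumptions). Each call to `transform` returns an array of shape (rows in batch, n_components), so stacking the pieces vertically in order gives one (n_samples, n_components) array.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Toy stand-in for the real data: 100 samples (rows), 8 features (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
chunk_size = 20  # rows per batch

ipca = IncrementalPCA(n_components=3)

# First pass: fit incrementally on row batches.
for start in range(0, X.shape[0], chunk_size):
    ipca.partial_fit(X[start:start + chunk_size])

# Second pass: transform the same row batches and keep them in order.
parts = []
for start in range(0, X.shape[0], chunk_size):
    parts.append(ipca.transform(X[start:start + chunk_size]))

# Stack the per-batch results back into one array.
out = np.vstack(parts)
print(out.shape)
```

Because the projection is applied to each row independently, the stacked result should match transforming all rows in a single call, up to floating-point rounding.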

There are 0 answers