Description of my goal and the problem:

I am currently working with the scanpy and anndata packages in Python (version 3.6.12). If you are not familiar with these packages, just know that an anndata.AnnData object stores a data matrix (numpy.ndarray) in the attribute X. The columns of X are described by a pandas.DataFrame in the var attribute, and the rows are described by one in the obs attribute. See this link for more information.

My goal is to have a class (e.g. class A) that inherits from the anndata.AnnData class. In this child class, I want to implement some processing methods, for example to filter out certain rows or columns. More importantly, I want these methods to modify the instance in place, without needing to return a copy (i.e. without return self). However, when I remove the return self from the methods, the instances of the class are not modified. More precisely, self is modified inside the method, but the instance the method was called on remains unchanged.

Code example:

Imagine the following example with class A inherited from anndata.AnnData. The class A has one method called remove_last_row() that removes the last row (i.e. obs) of the instance of the class A.

import anndata
import numpy as np

class A(anndata.AnnData): 
    
    def __init__(self, adata, data_type=None): 
        """
        Initialization method 
        
        Parameters: 
        -----------
        adata: anndata.AnnData, 
            The Anndata object
        """
        super().__init__(adata)
        
    def remove_last_row(self): 
        """
        Remove the last row of the anndata object
        """
        
        print("--> In A.remove_last_row() method:")
        print("before filtering: number rows = ", self.X.shape[0])
        
        # get the row index to keep (i.e. the index of the obs without the last one)
        index_to_keep = self.obs.index[:-1].values.astype(int)
        # Keep only those index: 
        self = self[index_to_keep, :]
        
        print("after filtering: number rows = ", self.X.shape[0])
        print("<-- exit A.remove_last_row() method.")

The problem with the remove_last_row() method is that the last row is removed from self inside the method, but the instance the method was called on is not modified. See the example below:

# Create an AnnData object: 
adata = anndata.AnnData(np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2]]))
    
# Create object A that is inherited from AnnData
obj_A = A(adata = adata)
    
# Test remove_last_row method
obj_A.remove_last_row()

print()
print("obj_A.X attribute = \n", obj_A.X)

Which results in:

--> In A.remove_last_row() method:
before filtering: number rows = 3
after filtering: number rows = 2
<-- exit A.remove_last_row() method.

obj_A.X attribute =
[[0. 0. 0.]
[1. 1. 1.]
[2. 2. 2.]]

We see that inside the remove_last_row() method, the last row is removed from self. However, obj_A (the instance of class A) is not modified by this method. How can I solve this without adding return self?


Additional information:

  • Python version=3.6.12
  • numpy version=1.19.1
  • anndata version=0.7.4
  • scanpy version=1.6.0

I have also tested a method that I have called addition() which adds a certain value to every element of the array X. With this method, I do not suffer from this problem.

If the method addition() is in the class A:

    def addition(self, x=1): 
        """
        Add a value of x for each element in the X numpy array in the AnnData object
        
        Parameters: 
        -----------
        x: float,
            The value added to every element 
        """
        self.X += x

We can test:

# Create an AnnData object: 
adata = anndata.AnnData(np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2]]))

# Create object A that is inherited from AnnData
print("Before addition")
obj_A = A(adata = adata)
print("obj_A.X attribute = \n", obj_A.X)

# Test the addition method
print()
obj_A.addition(x=1)
print()
print("After addition")
print("obj_A.X attribute = \n", obj_A.X)

The results are:

Before addition
obj_A.X attribute =
[[0. 0. 0.]
[1. 1. 1.]
[2. 2. 2.]]

After addition
obj_A.X attribute =
[[1. 1. 1.]
[2. 2. 2.]
[3. 3. 3.]]

As you can see the addition() method worked. It was able to modify the instance of the class.
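
One possible reading of the difference between the two methods, sketched without anndata: `self = self[...]` only rebinds the local name `self` to a new object, while `self.X += x` changes state reachable from the object the caller holds. The `Table` class below is a hypothetical stand-in for AnnData (it is not part of anndata), and it assumes that slicing returns a new object, as AnnData slicing does.

```python
class Table:
    """Hypothetical stand-in for AnnData: holds its rows in a plain list."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __getitem__(self, index):
        # Assumption: like AnnData slicing, this returns a NEW object
        # instead of changing the current one.
        return Table(self.rows[index])

    def remove_last_row_rebind(self):
        # Rebinds the local name `self` only; the caller's object is untouched.
        self = self[:-1]
        return len(self.rows)  # row count as seen inside the method

    def remove_last_row_inplace(self):
        # Assigns to an attribute of the object the caller holds.
        self.rows = self.rows[:-1]

    def addition(self, x=1):
        # Changes values inside the caller's object, like `self.X += x`.
        for row in self.rows:
            for j in range(len(row)):
                row[j] += x


t = Table([[0, 0, 0], [1, 1, 1], [2, 2, 2]])

rows_seen_inside = t.remove_last_row_rebind()
print(rows_seen_inside, len(t.rows))  # 2 inside the method, still 3 outside

t.addition(1)
print(t.rows)  # values changed on the caller's object

t.remove_last_row_inplace()
print(len(t.rows))  # 2: the caller's object lost its last row
```

For a real AnnData subclass, an in-place variant would have to update X, obs and the other aligned attributes consistently. anndata seems to have a private helper, `_inplace_subset_obs`, for this purpose; since it is private, its availability and behaviour should be checked against the installed version before relying on it.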
