Fastest way to access and put values in a matrix


I wrote a program and have been profiling it. The bottlenecks are the following (if I use a sparse matrix):

  26534    0.775    0.000   66.657    0.003 compressed.py:638(__setitem__)
  26534    2.240    0.000   59.438    0.002 compressed.py:688(_set_many)
  13318    2.993    0.000   50.024    0.004 compressed.py:742(_insert_many)
3034231   23.087    0.000   38.101    0.000 defmatrix.py:312(__getitem__)

If I use a dense matrix instead, these operations are the slow ones (the matrix has to be initialized to zeros):

3034072   24.902    0.000   41.135    0.000 defmatrix.py:312(__getitem__)
  11780   19.586    0.002   19.586    0.002 {numpy.core.multiarray.zeros}

The sparse-matrix version is faster overall (178 s versus 193 s), but retrieving and writing rows is clearly my bottleneck. I tried the take function, using range() to build an array of flat indices for the row, but that is much worse (by a factor of 10000) than what I am currently doing: for a matrix X, X[idx,:] for putting and X.getrow(idx).todense() for taking.

Is there a better way to access and replace these rows? My matrices are quite large (~100,000 rows, 20-500 columns).

Edit: I am using csr_matrix (but I am open to any kind of sparse matrix - this one seemed suited to grabbing rows). Below is a series of tests as an MWE. The timings are roughly 3e-4 s, 7e-3 s, and 0.1 s respectively. This surprises me, and I wonder if there is a faster way than the top approach. If I remove the todense() call, the sparse time is cut in half, but that still seems pretty slow.

import numpy as np
from time import time
from scipy.sparse import csr_matrix

def testFancy(mat, idxs):
    # Fancy indexing: pull out one row at a time.
    for i in idxs:
        x = mat[i, :]

def testTake(mat, idxs):
    # np.take with explicit flat indices for each row.
    for i in idxs:
        x = mat.take(range(i * 50, i * 50 + 50))

def testSparse(mat, idxs):
    # Sparse row extraction, cast back to dense.
    for i in idxs:
        x = mat.getrow(i).todense()

mat = np.random.rand(50000, 50)
idxs = np.random.randint(50000 - 1, size=1000)

# fancy
t0 = time()
testFancy(mat, idxs)
t1 = time()
print(t1 - t0)

# take
t0 = time()
testTake(mat, idxs)
t1 = time()
print(t1 - t0)

# sparse
mat = csr_matrix((50000, 50))
t0 = time()
testSparse(mat, idxs)
t1 = time()
print(t1 - t0)

There are 2 answers

user671931

I tried using a dok_matrix. The result on my test is 0.03 s, and the overall application runs in 66 s with:

3034124   11.833    0.000   20.293    0.000 defmatrix.py:312(__getitem__)

This seems to be a nice compromise - but I do wonder if it could be better.
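As a rough sketch of the dok_matrix pattern described above (the matrix shape and the accessed indices are illustrative assumptions, not from the post):

```python
import numpy as np
from scipy.sparse import dok_matrix

# dok_matrix stores entries in a dict, so single-element writes are cheap.
mat = dok_matrix((50000, 50))
mat[7, 3] = 1.5

# Slicing out one row and casting to a dense ndarray for numeric work:
row = mat[7, :].toarray()
print(row.shape)  # (1, 50)
```

The dict-of-keys format avoids the costly structure reallocation that per-element writes trigger in CSR, at the price of slower bulk arithmetic.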

rth

Just use fancy indexing, both for getting and setting rows in an array:

import numpy as np
from scipy.sparse import csr_matrix

mat_ds = np.random.rand(50000, 50)
mat_csr = csr_matrix((50000, 50))
mat_lil = mat_csr.tolil()

idxs = np.random.randint(50000 - 1, size=1000)

print(mat_lil[idxs, :].todense())
print(mat_csr[idxs, :])

mat_lil[idxs, :] = 2.0  # or any other expression that makes sense here
mat_csr[idxs, :] = 2.0

It shouldn't matter whether the array is sparse or not. This will be faster than any custom solution with a loop (~250x faster than testSparse in my case).
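For instance, a single batched gather of all the requested rows (a sketch mirroring the shapes in the question's MWE) replaces the 1000-iteration loop in testSparse:

```python
import numpy as np
from scipy.sparse import csr_matrix

mat = csr_matrix(np.random.rand(50000, 50))
idxs = np.random.randint(0, 50000, size=1000)

# One vectorized row gather instead of 1000 getrow() calls:
rows = mat[idxs, :].todense()
print(rows.shape)  # (1000, 50)
```

The per-call overhead of sparse indexing is paid once for the whole batch instead of once per row, which is where the speedup comes from.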

Of course, assignment to a sparse array should be done in a way that preserves sparsity, or the underlying arrays will be reallocated, which is costly for csr_matrix. For instance, the example above produces a SparseEfficiencyWarning for that reason.

Edit: in response to the comments. Let's consider querying just one row,

In [1]: %timeit -n 1 mat_csr[0, :].todense()
1 loops, best of 3: 101 µs per loop

In [2]: %timeit -n 1 mat_lil[0, :].todense()
1 loops, best of 3: 157 µs per loop

In [3]: %timeit -n 1 mat_ds[0, :]
The slowest run took 8.25 times longer than the fastest. This could mean that an intermediate result is being cached 
1 loops, best of 3: 954 ns per loop

So yes, querying a dense array is roughly 100 to 150 times faster than a sparse array with the result cast to dense (whether you use csr or lil), because there is less overhead. Nothing can be done about that; you just have to decide whether you need sparse arrays or not.