I have a design matrix that I'm converting to a sparse matrix using the scipy module.
It has many rows and only a few columns.
With this shape, is it better to use the CSC or the CSR format? Or are they strictly equivalent in terms of execution speed?
Basically, it looks like this example (but there are many more rows in the real one):
Thanks!
You can readily convert one format to the other (.tocsc(), .tocsr()). In fact, M.T for a csr just creates a csc with the same data.
In a number of cases, sparse functions convert a matrix to another format to perform certain actions. In other cases it gives an 'efficiency' warning if the format isn't the best. (Beware: warnings appear only once per run.)
If you are iterating over columns, or selecting mostly by column, csc is better, with the converse true for csr. For math, matrix products and such, they are equivalent.
Create the matrix one way, and do a few timing tests for typical operations.
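As a rough sketch of what such timing tests could look like: the matrix below is a made-up stand-in (random values, 100,000 rows by 10 columns, 10% density), and the operations timed are only guesses at what "typical" means for a design matrix, so swap in your own data and operations. The helper name time_op is just for illustration.

```python
import timeit

import numpy as np
import scipy.sparse as sp

# Hypothetical tall, narrow design matrix: many rows, few columns.
# Shape, density and values are assumptions for illustration only.
n_rows, n_cols = 100_000, 10
M_csr = sp.random(n_rows, n_cols, density=0.1, format="csr", random_state=0)
M_csc = M_csr.tocsc()  # same values, column-compressed storage

# Transposing a CSR matrix gives a CSC matrix.
transposed_format = M_csr.T.format

# Dense vector for the matrix-vector product.
x = np.random.default_rng(0).standard_normal(n_cols)


def time_op(fn, number=20):
    """Return the best average time per call, in seconds, over 3 repeats."""
    return min(timeit.repeat(fn, number=number, repeat=3)) / number


# Candidate "typical" operations; replace with whatever you actually do.
operations = {
    "column slice M[:, 3]": lambda M: M[:, 3],
    "row slice M[500, :]": lambda M: M[500, :],
    "matrix-vector M @ x": lambda M: M @ x,
    "gram matrix M.T @ M": lambda M: M.T @ M,
}

timings = {}
for name, op in operations.items():
    timings[name] = {
        "csr": time_op(lambda: op(M_csr)),
        "csc": time_op(lambda: op(M_csc)),
    }

for name, t in timings.items():
    print(f"{name:22s} csr {t['csr']:.2e} s   csc {t['csc']:.2e} s")
```

The numbers depend on the scipy version, the machine, and the sparsity pattern, so treat them as a guide for your own case rather than a general ranking of the two formats.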