Grouping and summing (evaluating functions) on matrix values in MATLAB


Many threads here show that accumarray is the answer in MATLAB for grouping (and computing on) values by index sets. That works quickly and well, but I need something similar for larger (ND) data fields.
Let's assume an example: we have a names vector with (non-unique) names and a data matrix of earnings, with one column per project

names=str2mat('Tom','Sarah','Tom','Max','Max','Jon');
earnings=[100 200 20;20 100 500;1 5 900; 100 200 200;200 200 0; 300 100 -250];

and now we want to calculate the column-sums for each name.
Okay, we can find out the indices by

[namesuq,i1,i2]=unique(names,'rows')

but after that, the obvious call

accumarray(i2,earnings)

does not work. One could of course loop over the unique names or over the rows, but that might be inefficient. Are there better ideas?
Additionally I tried

accumarray(i2,@(i)earnings(i,:))

but this is not implemented and results in

Error using accumarray
Second input VAL must be a full numeric, logical, or char vector or scalar.

Thanks for ideas.

Additions: Thanks to eitan-t for his solution, which works great for the example.
Unfortunately, my minimal working example did not show all my needs: the function I want to apply needs a whole row, or later maybe a complete matrix, which I need to group over a 3rd or even higher dimension.

To make that clearer: think of a matrix M of size a x b x c, where each index along the first dimension (a) corresponds to a name. My need is best illustrated by, for example, summing over all rows that share a name.
A naive implementation would be

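[nam,~,idx]=unique(names,'rows');   % idx(k) is the group number of row k
N=zeros(size(nam,1),size(M,2),size(M,3));
for ind=1:size(nam,1)
    N(ind,:,:)=sum(M(idx==ind,:,:),1);   % sum all slices belonging to name ind
end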

Is this clear? Are there solutions for this?

There is 1 answer

Eitan T

Based on this answer, the solution you're looking for is:

[namesuq, i1, i2] = unique(names, 'rows');
[c, r] = meshgrid(1:size(earnings, 2), i2);
y = accumarray([r(:), c(:)], earnings(:));

where the rows of y correspond to names(i1, :).
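
To see why this works, it may help to look at the subscripts on a tiny case. The following is only a minimal sketch with made-up values (vals and grp are hypothetical names, not part of the question): meshgrid pairs every group index with every column number, so accumarray receives one (row, column) subscript per element and adds up the elements that share a subscript.

```matlab
% Made-up data: 3 rows, 2 columns; rows 1 and 3 belong to the same group
vals = [1 2; 10 20; 100 200];
grp  = [1; 2; 1];                          % group index of each row

% r repeats grp across the columns, c holds the column number of each element
[c, r] = meshgrid(1:size(vals, 2), grp);
subs = [r(:), c(:)];                       % one (group, column) pair per element

% Elements with identical subscripts are summed
y = accumarray(subs, vals(:));             % y is [101 202; 10 20]
```

Here r(:) and c(:) are listed in the same column-major order as vals(:), which is what keeps each value attached to its own subscript pair.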

Example

%// Sample input
names = str2mat('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');
earnings = [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250]

%// Sum along columns by names 
[namesuq, i1, i2] = unique(names, 'rows');
[c, r] = meshgrid(1:size(earnings, 2), i2);
y = accumarray([r(:), c(:)], earnings(:))

The resulting sums are:

y =
   300   100  -250
   300   400   200
    20   100   500
   101   205   920

which correspond to names(i1, :):

ans =
    Jon  
    Max  
    Sarah
    Tom
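
For the ND case mentioned in the question's addition, one possible approach (a sketch, not something tested beyond this small example) is to flatten all trailing dimensions into columns, group as above, and reshape back. The 3-D array M below is made up for illustration, and the per-group function f is a hypothetical placeholder. For functions other than sum, a common idiom is to accumulate row indices instead of values and let the function return a cell.

```matlab
% Made-up 3-D data: 6 names x 3 projects x 2 pages (second page is all ones)
names = char('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');   % char pads like str2mat
earnings = [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250];
M = cat(3, earnings, ones(6, 3));

[namesuq, ~, i2] = unique(names, 'rows');
sz = size(M);

% Sums: flatten trailing dimensions, group-sum per column, restore the shape
M2 = reshape(M, sz(1), []);
[c, r] = meshgrid(1:size(M2, 2), i2);
N2 = accumarray([r(:), c(:)], M2(:));
N  = reshape(N2, [size(N2, 1), sz(2:end)]);

% Arbitrary function: accumulate row indices and wrap each result in a cell.
% f is a hypothetical example; it must return one row per group.
% Note: the order of idx within a group is not guaranteed by accumarray.
f = @(block) max(block, [], 1);
C = accumarray(i2, (1:sz(1))', [], @(idx) {f(M2(idx, :))});
G = reshape(vertcat(C{:}), [numel(C), sz(2:end)]);
```

N(:, :, 1) reproduces the sums y shown above, and the rows of N and G again correspond to namesuq. The cell-based variant calls f once per group, so it is likely slower than the pure accumarray sum when there are many groups.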