I have a 20x20 2D array, and for every column I want to get the value that occurs most often (excluding zeros), i.e. the value that wins the majority vote.
I can do that for a single column like this:
: np.unique(p[:,0][p[:,0] != 0],return_counts=True)
: (array([ 3, 21], dtype=int16), array([1, 3]))
: nums, cnts = np.unique(p[:,0][ p[:,0] != 0 ],return_counts=True)
: nums[cnts.argmax()]
: 21
Just for completeness, we can extend the earlier proposed method to a loop-based solution for 2D arrays -
# p is 2D input array
for i in range(p.shape[1]):
    nums, cnts = np.unique(p[:,i][ p[:,i] != 0 ],return_counts=True)
    output_per_col = nums[cnts.argmax()]
How do I do that for all columns without using a for loop?
We can use bincount2D_vectorized to get binned counts per column, where the bins are the individual integers. Then, simply slice out from the second count onwards (as the first count would be for 0), get the argmax, and add 1 (to compensate for the slicing). That's our desired output. Hence, the solution shown as a sample case run -
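The steps described above can be sketched as follows. The definition of bincount2D_vectorized is not reproduced in this post, so the helper below is an assumed reconstruction of a per-row 2D bincount (offset each row into its own block of bins, then run a single np.bincount). The name majority_per_column and the small sample array are hypothetical, and the sketch assumes p holds non-negative integers with at least one non-zero entry.

```python
import numpy as np

def bincount2D_vectorized(a):
    # Assumed reconstruction, not copied from the linked Q&A:
    # per-row bincount for a 2D array of non-negative integers.
    a = np.asarray(a)
    n_bins = int(a.max()) + 1
    # Shift row i by i * n_bins so that every row gets its own block of bins.
    offsets = np.arange(a.shape[0])[:, None] * n_bins
    flat = (a + offsets).ravel()
    return np.bincount(flat, minlength=a.shape[0] * n_bins).reshape(-1, n_bins)

def majority_per_column(p):
    # Hypothetical wrapper name. Transpose so that columns become rows.
    counts = bincount2D_vectorized(p.T)
    # Drop the count for 0, take the argmax, add 1 to undo the slicing.
    return counts[:, 1:].argmax(axis=1) + 1

# Small made-up sample, not the 20x20 array from the question.
p = np.array([[21, 0, 4],
              [ 3, 7, 4],
              [21, 7, 0],
              [21, 0, 9]], dtype=np.int16)
print(majority_per_column(p))  # [21  7  4]
```

On ties, both this sketch and the loop version pick the smallest of the tied values. A column that contains only zeros would yield 1 here, whereas the loop version would raise an error on the empty selection, so such columns may need separate handling.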
That transpose is needed because bincount2D_vectorized gets us 2D bincounts per row. Thus, for an alternative problem of getting ranks per row, simply skip that transpose. Also, feel free to explore other options in that linked Q&A to get 2D bincounts.
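As an illustration of the per-row variant without the transpose, here is a minimal sketch that builds the 2D bincounts with np.add.at instead. This is only one possible way to get 2D bincounts and is not necessarily one of the options from the linked Q&A. The function name row_mode_nonzero and the sample array are hypothetical, and non-negative integer input is assumed.

```python
import numpy as np

def row_mode_nonzero(a):
    # Hypothetical helper: most frequent non-zero value in each row.
    a = np.asarray(a)
    n_rows, n_cols = a.shape
    n_bins = int(a.max()) + 1
    counts = np.zeros((n_rows, n_bins), dtype=np.intp)
    # Row index for every element, in the same order as a.ravel().
    rows = np.repeat(np.arange(n_rows), n_cols)
    # Unbuffered accumulation: counts[r, v] += 1 for every element v in row r.
    np.add.at(counts, (rows, a.ravel()), 1)
    # Skip the bin for 0, then shift the argmax back by 1.
    return counts[:, 1:].argmax(axis=1) + 1

a = np.array([[0, 3, 3, 1],
              [2, 2, 0, 5],
              [7, 0, 7, 7]])
print(row_mode_nonzero(a))  # [3 2 7]
```

np.add.at tends to be slower than a single np.bincount call on large inputs, so this version is more about readability than speed.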