I have an array of probabilities. I would like the columns to sum to 1 (representing probability) and the rows to sum to X (where X is an integer, say 9).

I thought that I could normalize the columns, and then normalize the rows and multiply by X. But this didn't work: the resulting sums of the rows and columns were not exactly 1.0 and X.

This is what I tried:

import numpy as np

# B is 5 rows by 30 columns of probabilities
B = np.random.rand(5, 30)  # placeholder data so the snippet runs

# Normalizing columns to 1.0
col_sum = []
for col in B.T:
    col_sum.append(sum(col))

for row in range(B.shape[0]):
    for col in range(B.shape[1]):
        if B[row, col] != 0.0 and B[row, col] != 1.0:  # leave exact 0s and 1s alone
            B[row, col] = B[row, col] / col_sum[col]

# Normalizing rows to X (9.0)
row_sum = []
for row in B:
    row_sum.append(sum(row))

for row in range(B.shape[0]):
    for col in range(B.shape[1]):
        if B[row, col] != 0.0 and B[row, col] != 1.0:
            B[row, col] = (B[row, col] / row_sum[row]) * 9.0

2 Answers

Alain T.

This can only work if the number of columns in your matrix is X times the number of rows. For example, if X = 3 and you have 5 rows, then you must have 15 columns. So, you could make your 5x30 matrix work for X = 6 but not X = 9.

The reason for this is that, if each column sums to 1.0, the total of all values in the matrix will be 1.0 times the number of columns. And since you want each row to sum to X, the total of all values must also be X times the number of rows.

So: Columns * 1.0 = X * Rows
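
As a quick sanity check on that constraint, here is a small sketch in NumPy (is_feasible is a name I made up for illustration):

import numpy as np

def is_feasible(B, X):
    # Columns can all sum to 1 and rows can all sum to X only if cols == X * rows
    rows, cols = B.shape
    return cols == X * rows

B = np.random.rand(5, 30)
print(is_feasible(B, 6))   # True:  30 == 6 * 5
print(is_feasible(B, 9))   # False: 30 != 9 * 5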

If that constraint is met, you only have to scale all values proportionally to X / sum(row), and both dimensions will work out automatically, provided the initial values are properly balanced. The matrix is balanced when adjusting all rows to have the same sum also results in all columns having the same sum. If the matrix is not already balanced, adjusting the values would be similar to solving a sudoku (allegedly an NP problem), and the result would be largely unrelated to the initial values.

For example, here is a balanced 5x15 matrix for X = 3; each row sums to 21:

[0.7, 2.1, 1.4, 0.7, 1.4, 1.4, 0.7, 1.4, 1.4, 2.1, 0.7, 2.1, 1.4, 2.1, 1.4] 21
[2.8, 1.4, 0.7, 2.1, 1.4, 2.1, 0.7, 1.4, 2.1, 1.4, 0.7, 0.7, 1.4, 0.7, 1.4] 21
[1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 0.7, 2.8, 0.7, 0.7, 1.4, 2.1] 21
[1.4, 1.4, 1.4, 1.4, 2.1, 1.4, 1.4, 1.4, 0.7, 0.7, 2.1, 1.4, 1.4, 1.4, 1.4] 21
[0.7, 0.7, 2.1, 1.4, 0.7, 0.7, 2.8, 1.4, 1.4, 2.1, 0.7, 2.1, 2.1, 1.4, 0.7] 21

apply x = x * 3 / 21 to all elements ...

[0.1, 0.3, 0.2, 0.1, 0.2, 0.2, 0.1, 0.2, 0.2, 0.3, 0.1, 0.3, 0.2, 0.3, 0.2] 3.0
[0.4, 0.2, 0.1, 0.3, 0.2, 0.3, 0.1, 0.2, 0.3, 0.2, 0.1, 0.1, 0.2, 0.1, 0.2] 3.0
[0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.1, 0.4, 0.1, 0.1, 0.2, 0.3] 3.0
[0.2, 0.2, 0.2, 0.2, 0.3, 0.2, 0.2, 0.2, 0.1, 0.1, 0.3, 0.2, 0.2, 0.2, 0.2] 3.0
[0.1, 0.1, 0.3, 0.2, 0.1, 0.1, 0.4, 0.2, 0.2, 0.3, 0.1, 0.3, 0.3, 0.2, 0.1] 3.0

column sums: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
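
In NumPy, that single proportional adjustment could look like this (a sketch using the balanced example above; expect tiny floating-point rounding in the printed sums):

import numpy as np

X = 3
B = np.array([
    [0.7, 2.1, 1.4, 0.7, 1.4, 1.4, 0.7, 1.4, 1.4, 2.1, 0.7, 2.1, 1.4, 2.1, 1.4],
    [2.8, 1.4, 0.7, 2.1, 1.4, 2.1, 0.7, 1.4, 2.1, 1.4, 0.7, 0.7, 1.4, 0.7, 1.4],
    [1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 0.7, 2.8, 0.7, 0.7, 1.4, 2.1],
    [1.4, 1.4, 1.4, 1.4, 2.1, 1.4, 1.4, 1.4, 0.7, 0.7, 2.1, 1.4, 1.4, 1.4, 1.4],
    [0.7, 0.7, 2.1, 1.4, 0.7, 0.7, 2.8, 1.4, 1.4, 2.1, 0.7, 2.1, 2.1, 1.4, 0.7],
])

B = B * X / B.sum(axis=1, keepdims=True)   # apply x = x * 3 / 21 to all elements

print(B.sum(axis=1))   # each row sums to 3.0
print(B.sum(axis=0))   # each column sums to 1.0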
nraw

I'm not sure if I understood correctly, but it seems like what you're trying to accomplish may not be mathematically feasible.

Imagine you have a 2x2 matrix where you want the rows to sum to 1 and the columns to sum to 10. Even if you made all the entries 1 (their maximum possible value as probabilities), each column would only sum to 2, so you could never reach 10.
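
A minimal sketch of that bound in NumPy (the all-ones matrix is my own illustration):

import numpy as np

B = np.ones((2, 2))    # every entry at 1, the maximum for a probability
print(B.sum(axis=0))   # [2. 2.] -- each column can reach at most 2, never 10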