I want to find powers of a relatively small matrix, but this matrix consists of rational numbers of type Rational{BigInt}. By default, Julia utilizes only a single thread for such computations. I want to check if using multithreaded matrix multiplication would yield performance gains. How do I do this?
Below is an example of raising 32x32 matrix to the power of four. If I run it on i7-12700k it uses only one thread:
using Random
using LinearAlgebra
Random.seed!(42)
M = BigInt.(rand(Int128, 32, 32)) .// BigInt.(rand(Int128, 32, 32));
BLAS.set_num_threads(8)
@time M^4;
the output is:
19.976082 seconds (1.24 M allocations: 910.103 MiB, 0.19% gc time)
With just Float64 and big matricies I can see Julia correctly uses multiple threads.
N = rand(2^14,2^14)
@time N^4;
32.764584 seconds (1.71 M allocations: 4.113 GiB, 0.08% gc time, 1.14% compilation time)
As noted in comments above, BLAS isn't involved in this at all.
Since I have it, here's a very simple multi-threaded function:
Various packages can write this for you, e.g.
using Einsum; mul(A,B) = @vielsum C[i,k] := A[i,j] * B[j,k]or elseusing Tullio; mul(A,B) = @tullio C[i,k] := A[i,j] * B[j,k] threads=10.Higher powers are much slower, because the numbers involved are larger: