Why is this simple indexing variable gettng Box-ed in Julia?

417 views Asked by At

In solving Problem 10 from Project Euler manually (instead of simply using Primes), I implemented a naive Sieve of Eratosthenes in Julia using BitVector. It produces the correct output, but when I checked its type stability using @code_warntype, I found that i was being allocated as Core.Box, which affects the type of psum, and that in turn throws the type stability of the whole function away. This is the code used:

function primesum(limit::T = 2_000_000) where T<:Integer
    #easy way: using Primes; sum(primes(limit)) 

    #hard way: 
    is_prime_num = BitVector(limit)
    is_prime_num .= true

    psum = zero(limit)
    for i in 2:limit
        if is_prime_num[i]
            psum += i
            multiples_of_i = [k*i for k in 2:(limit÷i)]
            is_prime_num[multiples_of_i] .= false
        end
    end
    psum
end

And here is the Variables part of the code_warntype output (full output here):

Variables:
  #self#::#primesum
  limit::Int64
  #46::##46#47
  i::Core.Box
  multiples_of_i::Any
  #temp#@_6::Int64
  is_prime_num::BitArray{1}
  psum::Any
  J::Any
  #temp#@_10::Any

Overall, a lot of types in the code (including the function's return type) are left as either Any or a where _ type.

I was able to improve speed (almost 10x) and memory (~3x) by changing the for loop line to for i::T in 2:limit - then, i still gets set as a Core.Box, but that gets typeasserted to Int64 and doesn't propagate the instability to psum and others. But I'm more interested in why Julia couldn't infer i's type than in speeding up this specific code. I'm used to type instabilities making sense at least in retrospect, but this one seems pretty clear and simple to infer, so I'd like to know where there is type ambiguity here.

1

There are 1 answers

3
Bogumił Kamiński On BEST ANSWER

A temporary solution is to wrap i in let block around the comprehension (I have also proposed some small tweaks in your code that are minor cleanups - I have left creation of multiples_of_i as this was a core of your question but actually using this variable is also inefficient - it would be better to set is_prime_num vector to false in appropriate places using a loop):

function primesum(limit::Integer = 2_000_000)
    is_prime_num = trues(limit)
    psum = zero(limit)
    for i in 2:limit
        if is_prime_num[i]
            psum += i
            let i=i
                multiples_of_i = [k*i for k in 2:(limit÷i)]
                is_prime_num[multiples_of_i] .= false
            end
        end
    end
    psum
end

Hopefully in the long run the issue https://github.com/JuliaLang/julia/issues/15276 will be fixed and this will not be needed.