Is there a way to force Halide not to generate code that uses vector instructions?


We have implemented a few algorithms in Halide that use trigonometric functions such as arctan. For instrumentation purposes, however, we want to force Halide not to generate vector instructions.

We are using Visual C++ on Windows, with the cl compiler from the Visual Studio 2013 toolchain. So far we have tried forcing cl with /arch:IA32, but it still generates vector instructions.

Is there a way to force this from the Halide side? Alternatively, is there a way to intercept the math library calls so that Halide uses arctan functions we have written ourselves, which are not optimized to use vector instructions?

2

There are 2 answers

1
Zalman Stern

Generally Halide will not generate any code for atan and the implementation will come from the system math library (libm). (This is not true for all math routines as we provide internal implementations for some, but usually this is made explicit via names such as fast_log, fast_exp, etc.) To override this, you would generally provide your own implementation of libm or atan (and atan2, etc.), but Halide may allow you to define atan_f32 and atan_f64 to do the override. This may be advantageous as those should be declared with weak linkage, though that likely does not work on Windows. You could also change the definitions of these routines in src/runtime/posix_math.ll to point to your own.

In general Halide will only generate vectorized code if the schedule says to do so. However, LLVM has automatic vectorization passes that can generate vector instructions. On x86_64, the SIMD instructions will generally be used for scalar floating-point computation. On 32-bit x86, if you do not turn on any of the x86 SIMD flags in the Target (e.g. none of SSE41, AVX, etc.), then we should set the LLVM target machine to disallow SIMD instructions entirely. But that will not affect stuff in libm unless you take measures to do so at final link time.

You can also use HalideExtern to declare a call to a routine of your own choosing and use that instead of atan.
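To make the HalideExtern route concrete, here is a rough sketch, not a tested recipe. The routine name scalar_atan_f32, the helper emit_scalar_assembly, the output file name, and the polynomial itself are all made up for illustration. The Halide part is guarded so the scalar routine can be compiled and checked on its own. It assumes a Halide build that provides the HalideExtern_1 macro and accepts the target string "x86-32-windows". For JIT on Windows the routine would probably also need to be exported from the executable, which is not shown here.

```cpp
// Sketch only: every name here is illustrative, not part of Halide.
#if __has_include("Halide.h")
#include "Halide.h"
#define HAVE_HALIDE 1
#endif

// Plain scalar arctan with C linkage so Halide can call it by name.
// Uses range reduction plus a short Taylor series; accuracy is roughly
// single precision, which is an assumption about what you need.
extern "C" float scalar_atan_f32(float x) {
    const float kHalfPi = 1.57079632679f;   // pi / 2
    const float kSixthPi = 0.52359877560f;  // pi / 6
    const float kSqrt3 = 1.73205080757f;    // sqrt(3)
    const float kTanPi12 = 0.26794919243f;  // tan(pi / 12)

    const bool negate = x < 0.0f;           // atan(-x) = -atan(x)
    if (negate) x = -x;
    const bool invert = x > 1.0f;           // atan(x) = pi/2 - atan(1/x)
    if (invert) x = 1.0f / x;
    const bool shift = x > kTanPi12;        // atan(x) = pi/6 + atan(y)
    if (shift) x = (kSqrt3 * x - 1.0f) / (kSqrt3 + x);

    // Now |x| <= tan(pi/12): x - x^3/3 + x^5/5 - x^7/7 + x^9/9 in Horner form.
    const float x2 = x * x;
    float r = x * (1.0f - x2 * (1.0f / 3.0f - x2 * (1.0f / 5.0f
                  - x2 * (1.0f / 7.0f - x2 * (1.0f / 9.0f)))));

    if (shift) r += kSixthPi;
    if (invert) r = kHalfPi - r;
    return negate ? -r : r;
}

#ifdef HAVE_HALIDE
// Defines an overload scalar_atan_f32(Expr) that emits an extern call.
HalideExtern_1(float, scalar_atan_f32, float)

// Builds a tiny pipeline and writes assembly for a 32-bit x86 target with
// no SIMD features enabled, so the generated code can be inspected.
void emit_scalar_assembly() {
    Halide::ImageParam input(Halide::Float(32), 1, "input");
    Halide::Var i("i");
    Halide::Func out("out");
    out(i) = scalar_atan_f32(input(i));  // no vectorize() in the schedule

    Halide::Target target("x86-32-windows");
    out.compile_to_assembly("scalar_atan.s", {input}, "scalar_atan", target);
}
#endif
```

Whether scalar_atan_f32 itself ends up using x87 or scalar SSE instructions depends on how you compile that file (e.g. /arch:IA32 for 32-bit cl), not on Halide.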

0
Ashish Uthama

You ought to be able to set the target to be, say, host-x86-64, which should prevent Halide from using any vectorization (i.e., using SSE4/AVX* instructions).

If you are using AOT with generators, look at http://halide-lang.org/tutorials/tutorial_lesson_15_generators_usage.html. The my_first_generator_basic example there should not be using any SIMD instructions.

I'm not too familiar with JIT, but this example shows how to set the target while JITing: https://github.com/halide/Halide/wiki/Minimal-GPU-example. You should be able to use a similar approach to specify the target as x86-64.