I'm working through Oleg Kiselyov's tutorial Reconciling Abstraction with High Performance: A MetaOCaml approach. One exercise (exercise 23) asks for a let-insertion to bind an array index access to a local variable.
The function in question is vmult_ca, which generates code for multiplying arrays of complex numbers:
let vmult_ca :
    (float_complex array -> float_complex array -> float_complex array -> unit)
    code =
  .<fun vout v1 v2 ->
      let n = Array.length vout in
      (* vector representations *)
      .~(let vout = OVec (.<n>., fun i v ->
                      .<vout.(.~i) <- .~(of_code_complex v)>.) in
         let v1 = Vec (.<n>., fun i ->
                      of_complex_code .<v1.(.~i)>.) in
         let v2 = Vec (.<n>., fun i ->
                      of_complex_code .<v2.(.~i)>.) in
         let module V = VMULT(FloatCodeComplex)(VecDyn) in
         V.vmult vout v1 v2)
  >.
;;
Here vout is the output vector that stores the result.
Vec (n, fun i -> v) is an abstract vector where n is the length and fun i -> v maps each index to a value.
OVec (n, fun i v -> body) is an abstract "output vector" where n is the length and fun i v -> body runs on each index i and the associated output element v at i.
of_complex_code converts a complex code value to a code complex value, e.g. .<{real=1.0; imag=0.0}>. to {real=.<1.0>.; imag=.<0.0>.}.
The module VMULT defines (point-wise) vector multiplication (see the code here for details).
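To make the Vec and OVec descriptions above concrete, here is a minimal unstaged sketch in plain OCaml. The type definitions and the name vmult_model are assumptions made for illustration; the tutorial's actual definitions are staged and parameterized differently, so treat this only as a model of the idea.

```ocaml
(* Hypothetical, unstaged model of the abstract vectors described above. *)
type 'a vec = Vec of int * (int -> 'a)            (* length, index -> element *)
type 'a ovec = OVec of int * (int -> 'a -> unit)  (* length, index -> element -> store *)

(* Point-wise combination of two abstract vectors into an output vector.
   vmult_model is an illustrative name, not the tutorial's VMULT functor. *)
let vmult_model mul (OVec (n, set)) (Vec (_, get1)) (Vec (_, get2)) =
  for i = 0 to n - 1 do
    set i (mul (get1 i) (get2 i))
  done

let result =
  let out = Array.make 3 0.0 in
  let a = [| 1.0; 2.0; 3.0 |] and b = [| 4.0; 5.0; 6.0 |] in
  vmult_model ( *. )
    (OVec (Array.length out, fun i v -> out.(i) <- v))
    (Vec (Array.length a, fun i -> a.(i)))
    (Vec (Array.length b, fun i -> b.(i)));
  out
(* result = [|4.0; 10.0; 18.0|] *)
```

In the staged version, the length and the element functions produce code values instead of plain values, which is why every use of an element of v1 re-emits the array access.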
When run, vmult_ca generates the following code:
val vmult_ca :
  (float_complex array -> float_complex array -> float_complex array -> unit)
  code = .<
  fun vout_4 ->
    fun v1_5 ->
      fun v2_6 ->
        let n_7 = Array.length vout_4 in
        for i_8 = 0 to n_7 - 1 do
          vout_4.(i_8) <-
            {
              Cmplx.im =
                (((v1_5.(i_8)).Cmplx.re *. (v2_6.(i_8)).Cmplx.im) +.
                   ((v1_5.(i_8)).Cmplx.im *. (v2_6.(i_8)).Cmplx.re));
              Cmplx.re =
                (((v1_5.(i_8)).Cmplx.re *. (v2_6.(i_8)).Cmplx.re) -.
                   ((v1_5.(i_8)).Cmplx.im *. (v2_6.(i_8)).Cmplx.im))
            }
        done>.
Note that v1_5.(i_8) is repeated four times. The challenge is to insert a let somewhere in vmult_ca that binds v1_5.(i_8) to a local variable and so avoids the repetition. I was able to "cheat" by simply calling genlet on .<v1.(.~i)>., but I have no clue where to insert the let without genlet; any hint would be appreciated.
Let-insertion is a primitive operation in BER MetaOCaml that automatically binds the code passed to it to a freshly generated variable.
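To illustrate the mechanism, here is a small plain-OCaml model in which generated code is represented as strings. It is only a sketch of the idea, not how BER MetaOCaml implements genlet, and every name in it (fresh, pending, genlet_model, with_locus) is hypothetical.

```ocaml
(* A toy model of let-insertion; generated code is just a string. *)

let counter = ref 0
let fresh base = incr counter; Printf.sprintf "%s_%d" base !counter

(* Bindings requested since the nearest insertion point, newest first. *)
let pending : (string * string) list ref = ref []

(* Bind [expr] to a fresh variable and return the variable in its place. *)
let genlet_model expr =
  let v = fresh "t" in
  pending := (v, expr) :: !pending;
  v

(* Mark an insertion point: run the generator, then wrap its result
   in the lets it requested, oldest binding outermost. *)
let with_locus gen =
  let saved = !pending in
  pending := [];
  let body = gen () in
  let lets = List.rev !pending in
  pending := saved;
  List.fold_right
    (fun (v, e) acc -> Printf.sprintf "let %s = %s in %s" v e acc)
    lets body

(* A generator that uses its argument twice. *)
let square x = Printf.sprintf "%s *. %s" x x

let naive = with_locus (fun () -> square "a.(i)")
(* naive = "a.(i) *. a.(i)" *)

let shared = with_locus (fun () -> square (genlet_model "a.(i)"))
(* shared = "let t_1 = a.(i) in t_1 *. t_1" *)

let () = print_endline naive; print_endline shared
```

The generator square is unchanged in both cases; only the argument passed to it differs, which is what makes let-insertion convenient when the duplication happens deep inside abstract code.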
Here is a working example. Suppose you have code that returns the square of an array element,
and we want to generate optimized code that performs only one array access.
In the MetaOCaml style we can use genlet for that.
The generated code for
will be