I was trying to use logistic regression in OCaml. I need to use it as a black box for another problem I'm solving. I found the following site:
http://math.umons.ac.be/anum/en/software/OCaml/Logistic_Regression/
I pasted the following code (with a few modifications - I defined my own iris_features and iris_label) from this site into a file named logistic_regression.ml:
open Scanf
open Format
open Bigarray
open Lacaml.D

let log_reg ?(lambda=0.1) x y =
  (* [f_df] returns the value of the function to maximize and store
     its gradient in [g]. *)
  let f_df w g =
    let s = ref 0. in
    ignore(copy ~y:g w);  (* g ← w *)
    scal (-. lambda) g;   (* g = -λ w *)
    for i = 0 to Array.length x - 1 do
      let yi = float y.(i) in
      let e = exp(-. yi *. dot w x.(i)) in
      s := !s +. log1p e;
      axpy g ~alpha:(yi *. e /. (1. +. e)) ~x:x.(i);
    done;
    -. !s -. 0.5 *. lambda *. dot w w
  in
  let w = Vec.make0 (Vec.dim x.(0)) in
  ignore(Lbfgs.F.max f_df w);
  w

let iris_features = [1 ; 2 ; 3] ;;
let iris_labels = 2 ;;

let proba w x y = 1. /. (1. +. exp(-. float y *. dot w x))

let () =
  let sol = log_reg iris_features iris_labels in
  printf "w = %a\n" Lacaml.Io.pp_fvec sol;
  let nwrongs = ref 0 in
  for i = 0 to Array.length iris_features - 1 do
    let p = proba sol iris_features.(i) iris_labels.(i) in
    printf "Label = %i prob = %g => %s\n" iris_labels.(i) p
      (if p > 0.5 then "correct" else (incr nwrongs; "wrong"))
  done;
  printf "Number of wrong labels: %i\n" !nwrongs

I have the following questions:
- On trying to compile the code, I get the error message "Error: Unbound module Lacaml". I've installed Lacaml, run opam init several times, and tried to provide a flag -package = Lacaml; I don't know how to solve this.
- As you can see, I've defined my own versions of iris_features and iris_labels. Are the types correct, i.e., in the function log_reg, is the type of x int list and that of y int?

Both iris_features and iris_labels are arrays, and array literals in OCaml are delimited with [| and |] rather than plain square brackets, e.g., [| 1; 2; 3 |].

The iris_features array has type vec array, i.e., an array of vectors, not an array of integers. I didn't dig deep enough to know what data to put there, but syntactically it is an array literal whose elements are Lacaml vectors.

The Lacaml interface has changed a bit since the code was written: axpy no longer accepts a labeled ~x argument (both the x and y vectors are positional now), so you need to remove ~x and fix the argument order. I presume that x.(i) is x in the a*x + y expression and g corresponds to y, so the call becomes axpy ~alpha:(yi *. e /. (1. +. e)) x.(i) g.

This code also depends on lbfgs, so you need to install it as well (with opam, that is opam install lbfgs).

I would suggest using dune as your default build system, but for fast prototyping you can use ocamlbuild. Put your code into an empty folder in a file named regress.ml (you can pick another name, just update the build instructions correspondingly). You can then build it into a native executable with a command along the lines of ocamlbuild -pkgs lacaml,lbfgs regress.native, and run it as ./regress.native.

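To make the two code-level points above concrete (array literals versus list literals, and what axpy computes), here is a minimal sketch in plain OCaml that needs no libraries. The names as_list, as_array, toy_features, toy_labels and axpy_sketch are hypothetical, and the numbers are made-up placeholders, not the real iris data. With Lacaml, each feature row would presumably be wrapped with Vec.of_array to obtain the vec array that log_reg expects.

```ocaml
(* Lists use [ ... ]; arrays use [| ... |]. Only arrays support the
   x.(i) indexing that log_reg relies on. *)
let as_list : int list = [1; 2; 3]
let as_array : int array = [|1; 2; 3|]

(* Hypothetical toy data: one row of features per sample. The loss in
   log_reg uses exp(-y * <w, x>), which suggests labels of +1 or -1
   (an assumption read off the formula, not stated in the question). *)
let toy_features : float array array =
  [| [|5.1; 3.5|]; [|4.9; 3.0|]; [|6.3; 3.3|] |]
let toy_labels : int array = [| 1; 1; -1 |]

(* Plain-OCaml stand-in for BLAS axpy: y <- alpha * x + y, in place.
   Note the argument order: x first, then the vector that is updated. *)
let axpy_sketch ~alpha (x : float array) (y : float array) =
  Array.iteri (fun i xi -> y.(i) <- alpha *. xi +. y.(i)) x
```

The argument order in axpy_sketch mirrors the corrected call described above, where x.(i) comes first and the gradient g is the vector updated in place.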
If you're playing in the OCaml toplevel (aka the interpreter, i.e., running your code in the ocaml interpreter), you can load lacaml and lbfgs using two directives: #use "topfind";; followed by #require "lacaml,lbfgs";;

(The # is not a prompt but a part of the directive syntax, so don't forget to type it as well.) Now you can copy-paste your code into the interpreter and play with it.

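For experimenting in the toplevel before Lacaml and lbfgs are set up, the quantity that f_df computes can be restated on plain float arrays. The following is only a sketch under that assumption: dot, objective and proba here are hypothetical stand-ins written with the standard library, and the L-BFGS maximization step is left out entirely. It returns the regularized log-likelihood and its gradient as a pair instead of writing the gradient into g.

```ocaml
(* Inner product of two float arrays of equal length. *)
let dot a b =
  let s = ref 0. in
  Array.iteri (fun i ai -> s := !s +. ai *. b.(i)) a;
  !s

(* Value and gradient of
     f(w) = - sum_i log(1 + exp(-y_i <w, x_i>)) - 0.5 * lambda * <w, w>
   mirroring the structure of f_df in the question. *)
let objective ?(lambda = 0.1) x y w =
  let g = Array.map (fun wi -> -. lambda *. wi) w in   (* g = -lambda w *)
  let s = ref 0. in
  Array.iteri
    (fun i xi ->
       let yi = float_of_int y.(i) in
       let e = exp (-. yi *. dot w xi) in
       s := !s +. log1p e;
       let c = yi *. e /. (1. +. e) in
       (* g <- c * x_i + g, the role axpy plays in the original *)
       Array.iteri (fun j xij -> g.(j) <- g.(j) +. c *. xij) xi)
    x;
  (-. !s -. 0.5 *. lambda *. dot w w, g)

(* Probability that the model assigns to label y for sample x. *)
let proba w x y = 1. /. (1. +. exp (-. float_of_int y *. dot w x))
```

At w = 0 every sample gets probability 0.5 regardless of its label, which is a quick sanity check to try interactively.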
Bonus Track - building with dune
- Create an empty folder and put regress.ml there.
- Remove open Bigarray and open Scanf, as dune is very strict on warnings and turns them into errors (and it will warn you on those lines as they are, in fact, unused).