I was trying to use logistic regression in OCaml. I need to use it as a black box for another problem I'm solving. I found the following site:
http://math.umons.ac.be/anum/en/software/OCaml/Logistic_Regression/
I pasted the following code from this site (with a few modifications: I defined my own iris_features and iris_labels) into a file named logistic_regression.ml:
open Scanf
open Format
open Bigarray
open Lacaml.D

let log_reg ?(lambda=0.1) x y =
  (* [f_df] returns the value of the function to maximize and store
     its gradient in [g]. *)
  let f_df w g =
    let s = ref 0. in
    ignore(copy ~y:g w); (* g ← w *)
    scal (-. lambda) g; (* g = -λ w *)
    for i = 0 to Array.length x - 1 do
      let yi = float y.(i) in
      let e = exp(-. yi *. dot w x.(i)) in
      s := !s +. log1p e;
      axpy g ~alpha:(yi *. e /. (1. +. e)) ~x:x.(i);
    done;
    -. !s -. 0.5 *. lambda *. dot w w
  in
  let w = Vec.make0 (Vec.dim x.(0)) in
  ignore(Lbfgs.F.max f_df w);
  w

let iris_features = [1 ; 2 ; 3] ;;
let iris_labels = 2 ;;

let proba w x y = 1. /. (1. +. exp(-. float y *. dot w x))

let () =
  let sol = log_reg iris_features iris_labels in
  printf "w = %a\n" Lacaml.Io.pp_fvec sol;
  let nwrongs = ref 0 in
  for i = 0 to Array.length iris_features - 1 do
    let p = proba sol iris_features.(i) iris_labels.(i) in
    printf "Label = %i prob = %g => %s\n" iris_labels.(i) p
      (if p > 0.5 then "correct" else (incr nwrongs; "wrong"))
  done;
  printf "Number of wrong labels: %i\n" !nwrongs

I have the following questions:
- On trying to compile the code, I get the error message "Error: Unbound module Lacaml". I've installed Lacaml, run opam init several times, and tried to provide a flag -package = Lacaml; I don't know how to solve this.
- As you can see, I've defined my own versions of iris_features and iris_labels. Are the types correct, i.e., in the function log_reg, is the type of x int list and that of y int?

Both iris_features and iris_labels are arrays, and array literals in OCaml are delimited with [| and |], e.g., [| 1; 2; 3 |], not with the [ and ] used for lists.

The iris_features array has type vec array, i.e., an array of vectors, not an array of integers. I didn't dig deep enough to know what values to put there, but each element has to be a Lacaml vector, which can be built from a float array with Vec.of_array. Likewise, iris_labels has to be an int array with one label per feature vector, not a single int.

The Lacaml interface has changed a bit since the code was written, and axpy no longer accepts a labeled ~x argument (both the x and y vectors are positional now), so you need to remove ~x and fix the order. I presume that x.(i) is x in the a*x + y expression and g corresponds to y, so the call becomes axpy ~alpha:(yi *. e /. (1. +. e)) x.(i) g.

This code also depends on lbfgs, so you need to install it as well, e.g., with opam install lbfgs.

I would suggest using dune as your default build system, but for fast prototyping you can use ocamlbuild. Put your code into an empty folder in a file named regress.ml (you can pick another name, just update the build instructions correspondingly). You can then build it to a native executable, passing both libraries as packages (e.g., ocamlbuild -pkg lacaml -pkg lbfgs regress.native), and run the resulting ./regress.native.

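To make the points about data shapes and axpy concrete, here is a minimal sketch that uses only the standard library, so it runs without Lacaml. The sample values and the helper axpy_ref are made up for illustration; plain float arrays stand in for Lacaml vectors (with Lacaml, each row would be wrapped in Vec.of_array to get a vec array).

```ocaml
(* Labels: an int array with one entry per sample. The formulas in
   log_reg and proba treat labels as +1 / -1. Hypothetical values. *)
let iris_labels = [| 1; -1; 1 |]

(* Features: one row per sample. These are plain float arrays; with
   Lacaml each row would be converted to a vec. Hypothetical values. *)
let iris_rows = [| [| 1.0; 2.0 |]; [| 3.0; 0.5 |]; [| 0.5; 1.5 |] |]

(* Reference semantics of BLAS axpy: y <- alpha * x + y.
   [y] is updated in place; [x] is left untouched. *)
let axpy_ref ~alpha x y =
  Array.iteri (fun i xi -> y.(i) <- alpha *. xi +. y.(i)) x

let () =
  (* In f_df, g plays the role of y and x.(i) the role of x. *)
  let g = [| 1.0; 1.0 |] in
  axpy_ref ~alpha:2.0 iris_rows.(0) g;
  Printf.printf "g = [| %g; %g |]\n" g.(0) g.(1)
```

The argument order of axpy_ref (x first, then the vector being updated) mirrors the positional order described above for the current Lacaml axpy.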
If you're playing in the OCaml toplevel (aka the interpreter, i.e., running your code in the ocaml interpreter), you can load lacaml and lbfgs with two #require directives, one per library. (The # is not a prompt but a part of the directive syntax, so don't forget to type it as well.) Now you can copy-paste your code into the interpreter and play with it.

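A sketch of those directives, assuming findlib is available in your toplevel and both libraries are installed. utop loads findlib automatically; in the plain ocaml toplevel you may need to enter #use "topfind";; first.

```ocaml
(* Toplevel directives, typed at the interpreter prompt (not compiled).
   The leading # is part of the directive itself. *)
#require "lacaml";;
#require "lbfgs";;
```
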
Bonus Track - building with dune
- Create an empty folder and put regress.ml there.
- Remove open Bigarray and open Scanf, as dune is very strict on warnings and turns them into errors (and it will warn you about those lines, as they are, in fact, unused).