I'm trying to use Optim in Julia to solve a two variable minimization problem, similar to the following
x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]
function sqerror(betas, X, Y)
err = 0.0
for i in 1:length(X)
pred_i = betas[1] + betas[2] * X[i]
err += (Y[i] - pred_i)^2
end
return err
end
res = optimize(b -> sqerror(b, x, y), [0.0,0.0])
res.minimizer
I do not quite understand what [0.0,0.0] means. By looking at the document http://julianlsolvers.github.io/Optim.jl/v0.9.3/user/minimization/. My understanding is that it is the initial condition. However, if I change that to [0.0,0., 0.0], the algorithm still work despite the fact that I only have two unknowns, and the algorithm gives me three instead of two minimizer. I was wondering if anyone knows what[0.0,0.0] really stands for.
It is initial value.
optimizeby itself cannot know how many values yoursqerrorfunction takes. You specify it by passing this initial value.For example if you add dimensionality check to
sqerroryou will get a proper error:Note that I also changed the loop condition to
eachindex(X, Y)to ensure that your function checks ifXandYvectors have aligned indices.Finally if you want performance and reduce compilation cost (so e.g. assuming you do this optimization many times) it would be better to define your optimized function like this: