I'm a Julia user new to Flux and machine learning. As a first test, and to understand how Flux works, I tried using Flux to estimate a simple linear regression model. But clearly I'm doing something wrong, as training the model using `train!` does not give me the expected OLS coefficients. This surprised me; since linear regression is a simple convex optimization problem, I would have expected gradient descent to quickly converge to the optimum. So I assume I've misunderstood something about how `train!` works.
Here's my code:
```julia
using Flux
using Flux: @epochs
using GLM

# Load data: the features of the Iris data set
features = Flux.Data.Iris.features();
x = features[1:3, :];
y = features[4, :];
J, N = size(x);  # number of explanatory variables, number of observations

model = Chain(Dense(J, 1));                  # define the model
loss(x, y) = Flux.Losses.mse(model(x), y);   # define the loss function

function loss_all(X, y)                      # and define a full-sample loss function
    l = 0;
    for i in 1:length(y)
        l += loss(X[:, i], y[i]);
    end
    return l
end

loss_all(x, y)
@epochs 10000 Flux.train!(loss, params(model), [(x, y)], Descent(0.01));  # train the model
loss_all(x, y)

# How does the result compare to OLS (should be exactly the same)?
x_augmented = vcat(ones(1, N), x);
ols = inv(x_augmented * transpose(x_augmented)) * x_augmented * y
y_hat = transpose(x_augmented) * ols;
sse = sum((y_hat - y) .^ 2)
```
I expect I'm making a silly mistake, but I would really appreciate it if someone could help me identify the problem.
The easiest way to fix it is to make sure `y` has a shape matching `model(x)`:

```julia
y = features[4:4, :];   # a 1×150 matrix rather than a 150-element vector
```

Note that:

```julia
mse(ŷ, y; agg = mean)
```

expands to:

```julia
agg((ŷ .- y) .^ 2)
```
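Because `agg` defaults to `mean`, the call goes through without any error even when the shapes disagree. Here is a small demonstration of my own (not from the original post) of what the broadcast actually computes:

```julia
using Flux

ŷ = ones(1, 3);          # stand-in predictions, shape (1, 3)
y = [1.0, 2.0, 3.0];     # targets, shape (3,)

size((ŷ .- y) .^ 2)      # (3, 3): broadcasting produced a matrix, not 3 elementwise errors

Flux.Losses.mse(ŷ, y)    # ≈ 1.667: silently averages all 9 entries of that matrix
```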
So `model(x)` and `y` should have the same shape (after my fix they are `(1, 150)`).
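As a quick sanity check (my addition, reusing the data and model from the question), the shapes now line up:

```julia
size(model(x)), size(y)   # ((1, 150), (1, 150)) once y = features[4:4, :]
```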
In your original code it was `(1, 150)` vs `(150,)`, which meant that the dimensions got broadcasted to `(150, 150)` after `(model(x) .- y).^2`.
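For completeness, here is a minimal end-to-end sketch of the corrected version (my own addition, not from the original post; it assumes the same older Flux version as the question, where `@epochs`, `params`, and `Flux.Data.Iris` are available). With matching shapes, plain gradient descent should drive the training loss toward the OLS optimum, though how close it gets depends on the learning rate and the number of epochs:

```julia
using Flux
using Flux: @epochs

features = Flux.Data.Iris.features();
x = features[1:3, :];
y = features[4:4, :];                        # 1×150, matching model(x)

model = Chain(Dense(3, 1));
loss(x, y) = Flux.Losses.mse(model(x), y);
@epochs 10000 Flux.train!(loss, params(model), [(x, y)], Descent(0.01));

# Compare against the closed-form OLS solution from the question.
N = size(x, 2);
x_augmented = vcat(ones(1, N), x);
ols = inv(x_augmented * transpose(x_augmented)) * x_augmented * vec(y);
sse = sum((transpose(x_augmented) * ols - vec(y)) .^ 2);

# OLS minimizes the same objective, so the trained MSE should be close to sse / N.
Flux.Losses.mse(model(x), y), sse / N
```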