Perl regression without intercept

I am trying to fit a linear regression in Perl with the Statistics::Regression module, but without an intercept. How can I fit a model that has no intercept? I get correct results with an intercept using the following code:

#!/usr/bin/perl
use strict;
use warnings;
use Statistics::Regression;

# Two regressors: a constant column (the intercept) and x.
my $reg = Statistics::Regression->new("sample regression", ["const", "x"]);

open(my $f1, "<", "ema_bid_44671_11536")
    or die "can't open ema_bid_44671_11536: $!";

while (my $line = <$f1>) {
    chomp($line);                 # strip the newline before splitting
    my @row = split(",", $line);
    # y is field 2; the regressors are the constant 1.0 and field 1
    $reg->include($row[2], [1.0, $row[1]]);
}

$reg->print();
close $f1;

Next, I changed the 1.0 to 0.0 in the last statement inside the while loop (to force a zero constant), so the code looks like this:

#!/usr/bin/perl
use strict;
use warnings;
use Statistics::Regression;

# Same two regressors as before; only the "const" values change below.
my $reg = Statistics::Regression->new("sample regression", ["const", "x"]);

open(my $f1, "<", "ema_bid_44671_11536")
    or die "can't open ema_bid_44671_11536: $!";

while (my $line = <$f1>) {
    chomp($line);                 # strip the newline before splitting
    my @row = split(",", $line);
    # the "const" column is now 0.0 for every observation
    $reg->include($row[2], [0.0, $row[1]]);
}

$reg->print();
close $f1;

This version fails with the following error:

regression_ema.pl::Statistics::Regression:standarderrors: I cannot compute the theta-covariance matrix for variable 1 0
 at /usr/local/share/perl/5.20.2/Statistics/Regression.pm line 619, <$f1> line 2472.
    Statistics::Regression::standarderrors(Statistics::Regression=HASH(0x23f41f0)) called at /usr/local/share/perl/5.20.2/Statistics/Regression.pm line 430
    Statistics::Regression::print(Statistics::Regression=HASH(0x23f41f0)) called at regression_ema.pl line 23
There is 1 answer

Sinan Ünür

The documentation says:

Please note that you must provide the constant if you want one.

which would lead me to believe that one can simply omit the constant term if it is not wanted. But, in that case, the module croaks:

t.pl::Statistics::Regression:new: Cannot run a regression without at least two variables.

I have a long-held belief that, for numerical analysis and statistical computations, one ought not to trust well-intentioned but ill-conceived amateurish contributions.

Use something that is intended to be used by statisticians. R is one obvious alternative: there, a formula such as y ~ x - 1 fits a model without an intercept.
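
If staying in Perl is a requirement, a one-predictor model with no intercept does not strictly need a regression module. Setting the "const" column to 0.0 for every row makes that regressor identically zero, so the normal equations become singular, which is plausibly why the standard errors cannot be computed. For y = b*x, the least-squares slope has the closed form b = sum(x*y) / sum(x*x). The following is only a minimal sketch under that assumption: `fit_through_origin` is a hypothetical helper, not part of Statistics::Regression, and the data are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Least-squares slope for y = b*x (no intercept): b = sum(x*y) / sum(x*x).
# fit_through_origin is a hypothetical helper, not a Statistics::Regression API.
sub fit_through_origin {
    my ($xs, $ys) = @_;
    die "x and y must have the same length\n" unless @$xs == @$ys;

    my ($sxy, $sxx) = (0, 0);
    for my $i (0 .. $#$xs) {
        $sxy += $xs->[$i] * $ys->[$i];   # accumulate sum of x*y
        $sxx += $xs->[$i] ** 2;          # accumulate sum of x^2
    }
    die "all x values are zero; the slope is undefined\n" if $sxx == 0;
    return $sxy / $sxx;
}

# Made-up data lying exactly on y = 2x.
my @x = (1, 2, 3, 4);
my @y = (2, 4, 6, 8);
printf "slope = %.4f\n", fit_through_origin(\@x, \@y);   # prints "slope = 2.0000"
```

This returns only the slope. Standard errors, R-squared and the like would have to be computed separately, which is one more reason to prefer a dedicated statistics package for anything beyond a quick check.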