With CRF++, MIRA works for me but CRF-L1 and CRF-L2 do not


It may not matter, but I am using the Windows distribution of CRF++ 0.58.

I have successfully used MALLET to train a CRF model and then test it. When I try to use the same training and test files with CRF++ (after creating a template file), I get a

The line search routine mcsrch failed: error code:0

error when I use either

-a CRF-L1

or the default

-a CRF-L2

When I use

-a MIRA

though, both training and testing run without error.

The format of the training and test data can be the same for both MALLET and CRF++, so that is not the issue. My template file is as simple as

#Mixed
M00:%x[0,0]
M01:%x[0,1]
M02:%x[0,2]
......
M12:%x[0,12]

The last column of my training data is either 0 or 1, which is the label to classify. None of my features contain whitespace; I use underscores where necessary. Am I missing something simple here? What would cause the L1 and L2 regularizations to fail like that?

2

There are 2 answers

1
demongolem On BEST ANSWER

I knew it was something silly ...

To use features the way I am using them, each template line needs the U prefix (as in Unigram), so U00:%x[0,0] is fine. You can't just call your features anything you want.
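For anyone who wants to catch this before training, here is a minimal sketch of a template check. It assumes only what the CRF++ documentation states: the template type is given by the first character of the line (U for unigram, B for bigram) and lines starting with # are comments. The function name `find_bad_template_lines` and the sample template are made up for illustration and are not part of CRF++.

```python
def find_bad_template_lines(template_text):
    """Return (line_number, line) pairs whose first character is not U or B.

    Hypothetical helper, not part of CRF++. Blank lines and '#' comment
    lines are skipped because they define no template.
    """
    bad = []
    for number, raw in enumerate(template_text.splitlines(), start=1):
        line = raw.rstrip()
        if not line or line.startswith("#"):
            continue
        if line[0] not in ("U", "B"):
            bad.append((number, line))
    return bad


# Illustrative template mixing the prefix from the question with a valid one.
template = """#Mixed
M00:%x[0,0]
M01:%x[0,1]
U02:%x[0,2]
"""

for number, line in find_bad_template_lines(template):
    print(f"line {number}: {line!r} does not start with U or B")
```

In practice you would read the text from your real template file instead of the inline string.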

I also discovered that if I stripped my test data down to a single sentence, I would get the same error message. When I restored the test data to its original size of around 2600 sentences, the regularization algorithms ran. Overfitting is a common cause of this error message across various NLP and ML applications, but that was not the true problem in my case.

0
user_1177868 On

It can also happen in the extreme case of a dataset with just one class (due to a bug in the training set generation procedure).
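A quick sanity check along these lines can be sketched as follows. It assumes the usual CRF++ data layout (whitespace-separated columns, one token per line, a blank line between sentences, label in the last column). The function name `summarize_training_data` and the sample rows are invented for illustration.

```python
from collections import Counter


def summarize_training_data(text):
    """Count sentences and label frequencies in CRF++-style column data.

    Hypothetical helper: assumes blank lines separate sentences and the
    last whitespace-separated column of each token line is the label.
    """
    sentences = 0
    labels = Counter()
    in_sentence = False
    for raw in text.splitlines():
        columns = raw.split()
        if not columns:
            # A blank line ends the current sentence, if any.
            in_sentence = False
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        labels[columns[-1]] += 1
    return sentences, labels


# Invented sample: two sentences, labels 0 and 1 in the last column.
sample = "the DT 0\ndog NN 1\n\nruns VB 0\n"

sentences, labels = summarize_training_data(sample)
print(f"{sentences} sentences, label counts: {dict(labels)}")
if len(labels) < 2:
    print("warning: only one class present in the data")
```

If the label count comes back as one, or the sentence count is much smaller than expected, the data generation step is worth checking before blaming the optimizer.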