I'm building a language recognizer. I had planned to classify my i-vectors with neural networks, but in the many papers I've read, the authors always use other methods such as SVM or PLDA. Can someone explain why? Or is it fine to use neural networks?

1 Answer

Nikolay Shmyrev On Best Solutions

Neural networks are good for complex, non-linear, multi-feature input. I-vectors, by design, map the speaker space to a very simple space in which speakers are easily separated with logistic regression or an SVM, which is why such simple classifiers are the usual choice on top of i-vectors.
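To make the point concrete, here is a minimal sketch of multinomial logistic regression on i-vector-like data, written in plain Python with no dependencies. Everything in it is an assumption for illustration: the "i-vectors" are synthetic Gaussian clusters (one per language) rather than vectors from a real extractor, the dimension of 20 is far smaller than typical i-vectors, and the function names (`make_fake_ivectors`, `train_logreg`, and so on) are made up. In practice you would probably use a library implementation of logistic regression or an SVM instead of hand-written gradient descent.

```python
import math
import random

def make_fake_ivectors(n_langs=3, per_lang=60, dim=20, seed=0):
    """Synthetic stand-ins for i-vectors: one Gaussian cluster per language."""
    rng = random.Random(seed)
    data = []
    for lang in range(n_langs):
        center = [rng.gauss(0.0, 2.0) for _ in range(dim)]
        for _ in range(per_lang):
            vec = [c + rng.gauss(0.0, 1.0) for c in center]
            data.append((vec, lang))
    rng.shuffle(data)
    return data

def softmax(scores):
    # Subtract the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, biases, vec):
    # One linear score per language, then softmax.
    scores = [sum(w * x for w, x in zip(row, vec)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

def train_logreg(data, n_langs, dim, lr=0.01, epochs=30):
    """Multinomial logistic regression trained with plain SGD."""
    weights = [[0.0] * dim for _ in range(n_langs)]
    biases = [0.0] * n_langs
    for _ in range(epochs):
        for vec, label in data:
            probs = predict_proba(weights, biases, vec)
            for k in range(n_langs):
                # Gradient of the cross-entropy loss w.r.t. score k.
                grad = probs[k] - (1.0 if k == label else 0.0)
                biases[k] -= lr * grad
                row = weights[k]
                for j in range(dim):
                    row[j] -= lr * grad * vec[j]
    return weights, biases

def accuracy(weights, biases, data):
    correct = 0
    for vec, label in data:
        probs = predict_proba(weights, biases, vec)
        if probs.index(max(probs)) == label:
            correct += 1
    return correct / len(data)

data = make_fake_ivectors()
split = int(0.75 * len(data))
train, test = data[:split], data[split:]
weights, biases = train_logreg(train, n_langs=3, dim=20)
test_accuracy = accuracy(weights, biases, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The synthetic clusters are deliberately well separated, so a linear model already does well here. Real i-vectors are noisier, but the idea is the same: when the classes are close to linearly separable, a linear classifier has little left for a neural network to improve on.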

If you want to try neural networks, try something end-to-end, such as https://github.com/FlashTek/vggvox-pytorch