Text representation for neural network training


I'm developing a neural network with nntool in MATLAB. My inputs are 11,250 text files of varying length (from 10 to 500 words, or roughly 10 to 200 words if I remove redundant words). I haven't found a good way to represent these texts as numerical data for my training algorithm. I thought about building a vocabulary of words, but it turned out to contain 16,000 distinct words, which is huge. Some words are shared between text files.


1 Answer

Answered by 404pio:

For a quick solution, look into "bag of words" or "tf-idf". If you don't know what these are, start here: https://en.wikipedia.org/wiki/Vector_space_model or https://en.wikipedia.org/wiki/Document_classification .
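As a rough illustration of the tf-idf idea (not the asker's MATLAB/nntool workflow), here is a minimal sketch in Python using scikit-learn's TfidfVectorizer; the `corpus/*.txt` paths and the `max_features` cap are assumptions made for the example:

```python
# Minimal sketch: turn a folder of text files into a tf-idf matrix.
# Assumes the documents live as *.txt files under "corpus/";
# max_features caps the 16,000-word vocabulary at a smaller size.
import glob

from sklearn.feature_extraction.text import TfidfVectorizer

# Read every document into a list of strings.
paths = sorted(glob.glob("corpus/*.txt"))
docs = [open(p, encoding="utf-8").read() for p in paths]

# Build the vocabulary and the document-term matrix in one step.
# stop_words="english" drops common filler words; max_features keeps
# only the most frequent terms, shrinking the input dimension.
vectorizer = TfidfVectorizer(stop_words="english", max_features=2000)
X = vectorizer.fit_transform(docs)  # sparse matrix, shape (n_docs, n_terms)

print(X.shape)  # e.g. (11250, 2000)
# Each row of X is a fixed-length numeric vector for one document,
# which can then be fed to a neural network as a training input.
```

Capping or pruning the vocabulary this way is what keeps the fixed-length input vectors manageable even when the raw vocabulary has 16,000 words.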

Have you read any book about NLP? This one may be valuable, at least as a starting point: http://www.nltk.org/book/ .