Named entity recognition with a small data set (corpus)

Question

936 views Asked by mhbashari At 14 June 2015 at 11:12

I want to develop a named entity recognition (NER) system for the Persian language, but we only have a small NER-tagged corpus for training and testing. We may have a bigger and better corpus in the future. I therefore need a solution whose performance improves incrementally whenever new data is added, without merging the new data with the old data and retraining from scratch. Is there such a solution?

Original Q&A

There is 1 answer

**sebilasse** · Answer 1 · 2015-08-07T13:30:57+00:00

Yes, with your help: it is a work in progress. It is written in JavaScript, and the approach is "No training ..."

Please see https://github.com/redaktor/nlp_compromise/

It is a fork in which I have worked on NER over the last few days, and it will be optimized for use with different languages.

It combines a dictionary of words, a dictionary of rules, and a build tool. It would be awesome to work on Persian support (I am working on German). Support is planned for NER of the following entity types:

'CARDINAL' -> [ready]
'DATE' -> calendar based [gregorian calendar is ready]
'DURATION' -> see above [date ranges are ready]
'MEASURE' -> systems based [metric system and SI units ready, 80+ categories]
'MONEY' -> currencies based [ready in a few days]
'PERSON' -> word/rules based [english/european names are ready]
'ORGANIZATION'
'LOCATION'
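To illustrate the general idea of combining a word dictionary with a rule dictionary, here is a minimal Python sketch. It is not the API of nlp_compromise (which is JavaScript); the names `LEXICON`, `RULES`, `tag`, and `add_entries` and the sample entries are hypothetical, chosen only for illustration. It shows why such a setup needs no retraining: new knowledge is added by extending the dictionary.

```python
import re

# Hypothetical word dictionary: lowercased surface form -> entity label.
# Entries can be added at any time; nothing is retrained.
LEXICON = {
    "tehran": "LOCATION",
}

# Hypothetical rule dictionary: patterns tried in order on words
# that are not found in the lexicon.
RULES = [
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "DATE"),
    (re.compile(r"\d+(\.\d+)?"), "CARDINAL"),
]


def tag(text):
    """Return (word, label) pairs for words matched by the lexicon or a rule."""
    entities = []
    for token in text.split():
        word = token.strip(".,;:!?")  # drop surrounding punctuation
        label = LEXICON.get(word.lower())
        if label is None:
            for pattern, rule_label in RULES:
                if pattern.fullmatch(word):
                    label = rule_label
                    break
        if label is not None:
            entities.append((word, label))
    return entities


def add_entries(new_entries):
    """Extend the lexicon in place as newly tagged data arrives."""
    for word, label in new_entries.items():
        LEXICON[word.lower()] = label
```

With this sketch, `tag("Shiraz")` finds nothing at first; after `add_entries({"Shiraz": "LOCATION"})` the same call returns the new entity, and the existing entries are untouched. A real system would need proper tokenization and multi-word entries, which this sketch leaves out.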

I think it could be a starting point. I have not found the time to document the new features yet, so feel free to open issues on GitHub.
