I have been trying to train a DeepPavlov NER model using the training syntax given in their docs, and it keeps failing with the error message below:
/opt/anaconda3/envs/py36/lib/python3.6/site-packages/deeppavlov/dataset_readers/conll2003_reader.py in parse_ner_file(self, file_name)
104 items = line.split()
105 if len(items) < expected_items:
--> 106 raise Exception(f"Input is not valid {line}")
107 tokens.append(items[0])
108 tags.append(items[-1])
Exception: Input is not valid aio-pika==6.4.1
I used the following code to train the DeepPavlov model. It works on their sample dataset, but when I created my own dataset following their training guide, I keep getting the error above. NER training code:
from deeppavlov import configs, train_model, build_model
from deeppavlov.core.commands.utils import parse_config
import json
with configs.ner.ner_ontonotes_bert_mult.open(encoding='utf8') as f:
    ner_config = json.load(f)
ner_config['dataset_reader']['data_path'] = '/Users/smankari001/deeppavlov' # directory with train.txt, valid.txt and test.txt files
ner_config['metadata']['variables']['NER_PATH'] = '/Users/smankari001/deeppavlov'
ner_config['metadata']['download'] = [ner_config['metadata']['download'][-1]] # do not download the pretrained ontonotes model
ner_model = train_model(ner_config, download=True)
Input train.txt file:
What O
kind O
of O
memory O
? O
We O
respectfully O
invite O
you O
to O
watch O
a O
special O
edition O
of O
Across B-ORG
China I-ORG
. O
WW B-WORK_OF_ART
II I-WORK_OF_ART
Landmarks I-WORK_OF_ART
on I-WORK_OF_ART
the I-WORK_OF_ART
Great I-WORK_OF_ART
Earth I-WORK_OF_ART
of I-WORK_OF_ART
China I-WORK_OF_ART
: I-WORK_OF_ART
Eternal I-WORK_OF_ART
Memories I-WORK_OF_ART
of I-WORK_OF_ART
Taihang I-WORK_OF_ART
Mountain I-WORK_OF_ART
Standing O
tall O
on O
Taihang B-LOC
Mountain I-LOC
is O
the B-WORK_OF_ART
Monument I-WORK_OF_ART
to I-WORK_OF_ART
the I-WORK_OF_ART
Hundred I-WORK_OF_ART
Regiments I-WORK_OF_ART
Offensive I-WORK_OF_ART
. O
It O
is O
composed O
of O
a O
primary O
stele O
, O
secondary O
steles O
, O
a O
huge O
round O
sculpture O
and O
beacon O
tower O
, O
and O
the B-WORK_OF_ART
Great I-WORK_OF_ART
Wall I-WORK_OF_ART
, O
among O
other O
things O
. O
A O
primary O
stele O
, O
three B-CARDINAL
secondary O
steles O
, O
and O
two B-CARDINAL
inscribed O
steles O
. O
The B-EVENT
Hundred I-EVENT
Regiments I-EVENT
Offensive I-EVENT
was O
the O
campaign O
of O
the O
largest O
scale O
launched O
by O
the B-ORG
Eighth I-ORG
Route I-ORG
Army I-ORG
during O
the B-EVENT
War I-EVENT
of I-EVENT
Resistance I-EVENT
against I-EVENT
Japan I-EVENT
. O
This O
campaign O
broke O
through O
the O
Japanese B-NORP
army O
's O
blockade O
to O
reach O
base O
areas O
behind O
enemy O
lines O
, O
stirring O
up O
anti-Japanese B-NORP
spirit O
throughout O
the O
nation O
and O
influencing O
the O
situation O
of O
the O
anti-fascist O
war O
of O
the O
people O
worldwide O
. O
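For ner_config['dataset_reader']['data_path'] you need to specify the path to a folder that contains only the dataset files (train/valid/test). The error "Exception: Input is not valid aio-pika==6.4.1" says that the DatasetReader started to read lines from a requirements.txt file. A line such as aio-pika==6.4.1 is a single whitespace-separated item, so it fails the len(items) < expected_items check shown in the traceback.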
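As a minimal sketch of one way to do that, the helper below copies only the three dataset files into a separate folder, and a second helper scans a file for lines that would fail the same kind of check as in the traceback. The function names isolate_dataset_files and find_invalid_lines are hypothetical (they are not part of DeepPavlov), and expected_items=2 is an assumption based on the token/tag layout of your train.txt.

```python
import shutil
from pathlib import Path

# File names taken from the comment in the question's training code.
DATASET_FILES = ("train.txt", "valid.txt", "test.txt")


def isolate_dataset_files(source_dir, target_dir):
    """Copy only the dataset files into target_dir and return that path.

    Hypothetical helper, not a DeepPavlov function.
    """
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name in DATASET_FILES:
        shutil.copy(source / name, target / name)
    return target


def find_invalid_lines(file_path, expected_items=2):
    """Return (line_number, text) pairs for non-blank lines with too few items.

    expected_items=2 is an assumption: one token column plus one tag column.
    """
    invalid = []
    with open(file_path, encoding="utf8") as f:
        for number, line in enumerate(f, start=1):
            items = line.split()
            # Blank lines are skipped here; only short non-blank lines are reported.
            if items and len(items) < expected_items:
                invalid.append((number, line.strip()))
    return invalid
```

After calling isolate_dataset_files, set ner_config['dataset_reader']['data_path'] to the returned folder (converted with str()) so that the reader no longer sees requirements.txt or any other unrelated file. Running find_invalid_lines on each dataset file beforehand can help spot malformed lines in your own data.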