I trained a spancat model in spaCy, and training completed successfully. However, when I run it on test data, it doesn't make any predictions.
Here are the training results:
This is how I am doing the predictions:
for text in df['text_cleaned']:
    doc = nlp(text)
    spans = doc.spans
When I look at the spans they are all empty: https://i.stack.imgur.com/b0tGu.png
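Note that the loop above overwrites `spans` on every iteration, so only the last document's container is left to inspect. A minimal sketch of collecting the predictions per document instead, reading the group stored under the spancat `spans_key` (assumed to be `"sc"`). The helper name `collect_spans` is hypothetical, and `spacy.blank("en")` is only a stand-in for the trained pipeline, whose path is not shown in the post:

```python
import spacy


def collect_spans(nlp, texts, spans_key="sc"):
    """Return one list of (span text, label) pairs per input text."""
    results = []
    for doc in nlp.pipe(texts):
        # doc.spans behaves like a dict; the key is absent if nothing wrote to it
        group = doc.spans[spans_key] if spans_key in doc.spans else []
        results.append([(span.text, span.label_) for span in group])
    return results


# Stand-in pipeline (assumption): swap in the trained pipeline here.
nlp = spacy.blank("en")
predictions = collect_spans(nlp, ["Apple is a large company"])
print(predictions)
```

With the real pipeline, passing `df['text_cleaned']` as `texts` would give one entry per row, which makes it easier to see whether every document is empty or only some.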
Here is the cfg file I'm using:
[paths]
train = null
dev = null
vectors = null
init_tok2vec = null

[system]
gpu_allocator = null
seed = 444

[nlp]
lang = "en"
pipeline = ["tok2vec","spancat"]
batch_size = 1000
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}

[components.spancat]
factory = "spancat"
max_positive = null
scorer = {"@scorers":"spacy.spancat_scorer.v1"}
spans_key = "sc"
threshold = 0.5

[components.spancat.model]
@architectures = "spacy.SpanCategorizer.v1"

[components.spancat.model.reducer]
@layers = "spacy.mean_max_reducer.v1"
hidden_size = 128

[components.spancat.model.scorer]
@layers = "spacy.LinearLogistic.v1"
nO = null
nI = null

[components.spancat.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[components.spancat.suggester]
@misc = "spacy.ngram_suggester.v1"
sizes = [1,2,3]

[components.tok2vec]
factory = "tok2vec"

[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"

[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = ${components.tok2vec.model.encode.width}
attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
rows = [5000,1000,2500,2500]
include_static_vectors = true

[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = 256
depth = 8
window_size = 1
maxout_pieces = 3

[initialize.components.spancat.labels]
@readers = "spacy.read_labels.v1"
path = null
require = true

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[training]
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
max_epochs = 70
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dropout = 0.1
accumulate_gradient = 1
patience = 1600
max_steps = 20000
eval_frequency = 200
frozen_components = []
annotating_components = []
before_to_disk = null

[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2
get_length = null

[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
t = 0.0

[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = false
eps = 0.00000001
learn_rate = 0.001

[training.score_weights]
spans_sc_f = 1.0
spans_sc_p = 0.0
spans_sc_r = 0.0

[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null
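For reference, `sizes = [1,2,3]` on `spacy.ngram_suggester.v1` means that only spans of one, two, or three tokens are ever proposed as candidates, so a longer gold span is never among the spans the model gets to score. A rough pure-Python illustration of that enumeration follows. It is not spaCy's implementation, and `ngram_spans` is a hypothetical helper written only for this example:

```python
def ngram_spans(n_tokens, sizes):
    """Return (start, end) token offsets for every n-gram of the given sizes."""
    spans = []
    for size in sizes:
        # A size-k n-gram can start at positions 0 .. n_tokens - k
        for start in range(0, n_tokens - size + 1):
            spans.append((start, start + size))
    return spans


tokens = ["Apple", "is", "a", "large", "company"]
candidates = ngram_spans(len(tokens), sizes=[1, 2, 3])

whole_doc = (0, len(tokens))    # a span covering all five tokens
print(len(candidates))          # 5 + 4 + 3 = 12 candidate spans
print(whole_doc in candidates)  # False: five tokens is longer than any size
print((0, 1) in candidates)     # True: the single token "Apple"
```

If the gold spans in the training data are longer than the largest value in `sizes`, widening `sizes` or shortening the annotated spans would be the two things to try.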
Originally, for my NER model, I had labeled examples in this format: ('Apple is a large company', {'entities': [(0, 4, 'ORG')]})
I fed a list of examples in this format into this function to convert them to spancat format:
import spacy
from spacy.tokens import DocBin, Span, SpanGroup

def convert_to_docbin(input, output_path="./train.spacy", lang='en'):
    """ Convert a pair of text annotations into DocBin then save """
    # Load a new spacy model:
    nlp = spacy.blank(lang)
    # Create a DocBin object:
    db = DocBin()
    for text, annotations in input: # Data in previous format
        doc = nlp(text)
        ents = []
        spans = []
        for start, end, label in annotations['entities']: # Add character indexes
            spans.append(Span(doc, 0, len(doc), label=label))
            span = doc.char_span(start, end, label=label)
        doc.ents = ents # Label the text with the ents
        group = SpanGroup(doc, name="sc", spans=spans)
        doc.spans["sc"] = group

convert_to_docbin(examples, output_path="/train.spacy", lang='en')
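One possible cause, offered as a guess rather than a confirmed diagnosis: the converter builds every span as `Span(doc, 0, len(doc), label=label)`, which covers the whole document, while the result of `doc.char_span(start, end, ...)` is never used. A minimal sketch of a converter that stores the `char_span` result instead is below. The name `build_spancat_docbin` is hypothetical, and the example uses offsets `(0, 5)` for "Apple" as an assumption, since `(0, 4)` does not align with a token boundary and `char_span` returns `None` for it by default:

```python
import spacy
from spacy.tokens import DocBin


def build_spancat_docbin(examples, nlp, spans_key="sc"):
    """Convert (text, {'entities': [(start, end, label), ...]}) pairs to a DocBin."""
    db = DocBin()
    for text, annotations in examples:
        doc = nlp.make_doc(text)
        spans = []
        for start, end, label in annotations["entities"]:
            # char_span returns None if the offsets don't match token boundaries
            span = doc.char_span(start, end, label=label)
            if span is None:
                print(f"Skipping misaligned span {(start, end, label)!r} in {text!r}")
                continue
            spans.append(span)
        # Assigning a list of spans creates the span group under this key
        doc.spans[spans_key] = spans
        db.add(doc)
    return db


nlp = spacy.blank("en")
examples = [("Apple is a large company", {"entities": [(0, 5, "ORG")]})]
db = build_spancat_docbin(examples, nlp)

# Read the docs back to confirm what was stored
docs = list(db.get_docs(nlp.vocab))
print([(span.text, span.label_) for span in docs[0].spans["sc"]])

# db.to_disk("./train.spacy")  # save once the stored spans look right
```

Printing the stored spans before training is a cheap way to confirm that each span covers the intended tokens and fits within the suggester's `sizes`.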
My NER model, trained on the same examples, made a lot of predictions, so I'm wondering why spancat doesn't seem to be working. Is my training data in the wrong format? Is my config off? Something else?