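from transformers import pipeline
import textwrap

# Assumption: `wrapper` was not defined in the posted snippet; a default TextWrapper is used here.
wrapper = textwrap.TextWrapper()

sentence = 'American Airlines was the first airline to fly every A380 flight perfectly when President George Bush was in Office. The Woodlands Texas is a great place to be.'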
ner = pipeline('text-classification', model='dbmdz/bert-large-cased-finetuned-conll03-english', grouped_entities=True)
ners = ner(sentence)
print('\nSentence:')
print(wrapper.fill(sentence))
print('\n')
for n in ners:
  print(f"{n['word']} -> {n['entity_group']}")

I am running this in Google Colab. I had read that the error is caused by a bug in the transformers library and that the fix is to install the latest version, so I tried !pip install transformers --upgrade, but I still received the following:

/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py in _encode_plus(self, text, text_pair, add_special_tokens, padding_strategy, truncation_strategy, max_length, stride, is_split_into_words, pad_to_multiple_of, return_tensors, return_token_type_ids, return_attention_mask, return_overflowing_tokens, return_special_tokens_mask, return_offsets_mapping, return_length, verbose, **kwargs)
    574     ) -> BatchEncoding:
    575         batched_input = [(text, text_pair)] if text_pair else [text]
--> 576         batched_output = self._batch_encode_plus(
    577             batched_input,
    578             is_split_into_words=is_split_into_words,

TypeError: PreTrainedTokenizerFast._batch_encode_plus() got an unexpected keyword argument 'grouped_entities'

There is 1 answer

حمزة نبيل (best answer)

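There may be some confusion here: Named Entity Recognition is a token-classification task, not a text-classification task. The text-classification pipeline does not recognize grouped_entities and passes it on to the tokenizer, which is why the TypeError comes from _batch_encode_plus. Please update your code: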

ner = pipeline(
    'token-classification',
    model='dbmdz/bert-large-cased-finetuned-conll03-english',
    grouped_entities=True
)  # alias "ner" available

That works, but it raises a warning:

UserWarning: `grouped_entities` is deprecated and will be removed in version v5.0.0, defaulted to `aggregation_strategy="simple"` instead.

To avoid the warning, replace grouped_entities with aggregation_strategy:

# Updated code with 'aggregation_strategy'
ner = pipeline(
    'ner',
    model='dbmdz/bert-large-cased-finetuned-conll03-english',
    aggregation_strategy='simple'
)
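
For intuition about what aggregation_strategy='simple' does to the per-token predictions, here is a minimal pure-Python sketch. It is a simplified illustration and not the library's actual implementation: the helper names (group_entities, merge_group), the example tokens, and the scores are all made up for this example. The idea is that adjacent tokens sharing an entity type are merged into one span, "##" word pieces are glued back onto the previous piece, and the span score is the mean of the token scores.

```python
from statistics import mean


def merge_group(group):
    """Collapse a run of token predictions into a single entity span."""
    word = ""
    for tok in group:
        piece = tok["word"]
        if piece.startswith("##"):
            # WordPiece continuation: attach to the previous piece without a space.
            word += piece[2:]
        elif word:
            word += " " + piece
        else:
            word = piece
    return {
        "entity_group": group[0]["entity"].partition("-")[2],
        "word": word,
        "score": mean(tok["score"] for tok in group),
    }


def group_entities(tokens):
    """Merge adjacent tokens that share an entity type (simplified sketch)."""
    groups, current = [], []
    for tok in tokens:
        if tok["entity"] == "O":
            # A non-entity token closes any open span.
            if current:
                groups.append(merge_group(current))
                current = []
            continue
        prefix, _, tag = tok["entity"].partition("-")
        if current:
            last_tag = current[-1]["entity"].partition("-")[2]
            if tag == last_tag and prefix != "B":
                current.append(tok)
                continue
            # Different type, or an explicit B- tag: close the open span.
            groups.append(merge_group(current))
        current = [tok]
    if current:
        groups.append(merge_group(current))
    return groups


# Hypothetical token-level predictions; labels and scores are illustrative only.
example_tokens = [
    {"word": "American", "entity": "B-ORG", "score": 0.99},
    {"word": "Airlines", "entity": "I-ORG", "score": 0.97},
    {"word": "was", "entity": "O", "score": 0.99},
    {"word": "George", "entity": "B-PER", "score": 0.99},
    {"word": "Bush", "entity": "I-PER", "score": 0.99},
    {"word": "Wood", "entity": "B-LOC", "score": 0.90},
    {"word": "##lands", "entity": "I-LOC", "score": 0.80},
]

grouped = group_entities(example_tokens)
for g in grouped:
    print(f"{g['word']} -> {g['entity_group']}")
```

With the real pipeline, each item returned by ner(sentence) has the same 'word' and 'entity_group' keys, so the loop from the question works unchanged once the task name is corrected.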