How to download and import (preferably using spaCy and from Hugging Face) the latest trained official version of BioBERT to perform NER on medical text


Zhang et al. (2020) compared the accuracy of BioBERT and scispaCy NER models; overall, BioBERT won. How do I download and import (preferably using spaCy and from Hugging Face) the latest **trained** official version of BioBERT to perform NER on **uncased** medical text? If there is a better-performing NER model for medical text, please let me know. The goal is to identify diagnoses, operations, and *optionally* drug mentions.

I looked at a lot of Hugging Face code, but none of it seemed to support using a pre-trained model.


There is 1 answer

norm On

Answering my own question: I created a conda environment and installed a few packages.

    conda create --name biobertner python=3.11
    conda activate biobertner
    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
    pip3 install transformers
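
Before loading any model, it can help to confirm that the packages actually landed in the active environment. The following is only a minimal sketch: the helper names `check_environment` and `cuda_available` are made up for illustration, and it assumes nothing beyond the standard library plus the `torch.cuda.is_available()` call from PyTorch.

```python
import importlib.util


def check_environment(packages=("torch", "transformers")):
    """Return a dict mapping each package name to True if it is importable."""
    status = {}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    return status


def cuda_available():
    """Return torch.cuda.is_available(), or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check also works without torch

    return torch.cuda.is_available()


print(check_environment())
print("CUDA available:", cuda_available())
```

If `CUDA available` prints `False`, PyTorch should still run the model on the CPU, just more slowly.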

I used a BioBERT model from https://huggingface.co/alvaroalon2/biobert_diseases_ner. For the Python code, I did the following:

    from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

    # Download (or load from the local cache) the BioBERT disease NER model.
    model_name = "alvaroalon2/biobert_diseases_ner"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name)

    # Build a token-classification pipeline and run it on an example sentence.
    nlp = pipeline("ner", model=model, tokenizer=tokenizer)
    example = "she had a cold on the day she was diagnosed with cancer in her left lung on June 2023"
    ner_results = nlp(example)

    # Each result is a dict describing one tagged token.
    for ent in ner_results:
        print(ent)
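
The pipeline above prints one dict per token, so a multi-word or multi-piece mention arrives in several fragments. Here is a rough sketch of how those fragments could be stitched back into whole mentions using the character offsets. The function name `merge_entities`, the `B-DISEASE`/`I-DISEASE` label names, and the hand-written `sample` list are assumptions for illustration, not actual output from the model.

```python
def merge_entities(text, token_results):
    """Group token-level NER dicts into entity spans using character offsets."""
    spans = []
    for tok in token_results:
        label = tok["entity"]
        if "-" not in label:
            # Skip anything that is not a B-/I- tag (for example an outside label).
            continue
        prefix, ent_type = label.split("-", 1)
        continues = bool(
            spans
            and spans[-1]["type"] == ent_type
            # Only extend a span when the token directly follows the previous one.
            and tok["index"] == spans[-1]["last_index"] + 1
            and (prefix == "I" or tok["word"].startswith("##"))
        )
        if continues:
            spans[-1]["end"] = tok["end"]
            spans[-1]["last_index"] = tok["index"]
        else:
            spans.append({"type": ent_type, "start": tok["start"],
                          "end": tok["end"], "last_index": tok["index"]})
    for span in spans:
        del span["last_index"]  # bookkeeping only, not part of the result
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hand-written sample in the shape the "ner" pipeline returns (assumed labels).
text = "she had pneumonia and lung cancer"
sample = [
    {"entity": "B-DISEASE", "score": 0.99, "index": 3, "word": "pneum", "start": 8, "end": 13},
    {"entity": "I-DISEASE", "score": 0.98, "index": 4, "word": "##onia", "start": 13, "end": 17},
    {"entity": "B-DISEASE", "score": 0.99, "index": 6, "word": "lung", "start": 22, "end": 26},
    {"entity": "I-DISEASE", "score": 0.97, "index": 7, "word": "cancer", "start": 27, "end": 33},
]
for span in merge_entities(text, sample):
    print(span)
```

With real output, the call would be `merge_entities(example, ner_results)`. The pipeline also accepts an `aggregation_strategy` argument (for example `aggregation_strategy="simple"`) that does a similar grouping internally, which may be enough on its own.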