I would like to use sentence_transformers in Azure Machine Learning (AML) to run an XLM-RoBERTa model for sentence embeddings. I have a script in which I import sentence_transformers:
from sentence_transformers import SentenceTransformer
Once I run my AML pipeline, the run fails on this script with the following error:
AzureMLCompute job failed.
UserProcessKilledBySystemSignal: Job failed since the user script received system termination signal usually due to out-of-memory or segfault.
Cause: segmentation fault
NodeIp: #####
NodeId: #####
I'm pretty sure this import is causing the error, because if I comment it out, the rest of the script runs. This is odd, because sentence_transformers installs successfully.
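One way to narrow this down further, as a rough sketch rather than anything AML-specific: import each package in a separate child interpreter with Python's built-in faulthandler enabled, so a segfault leaves a traceback on stderr instead of killing the whole job. The helper name `try_import` and the list of packages checked are my own assumptions for illustration.

```python
import subprocess
import sys


def try_import(module: str, timeout: int = 300):
    """Import `module` in a child interpreter with faulthandler switched on.

    Returns (return_code, stderr). On Linux a negative return code means the
    child was killed by a signal (for example -11 for SIGSEGV).
    """
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Assumed dependency order: check the lower-level packages first.
    for name in ("torch", "transformers", "sentence_transformers"):
        code, err = try_import(name)
        print(name, "exit code:", code)
        if code != 0:
            print(err)
            break
```

If the first failing package is torch itself, the problem is likely in the PyTorch build rather than in sentence_transformers.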
These are the details of my compute:
Virtual machine size: STANDARD_NV24 (24 Cores, 224 GB RAM, 1440 GB Disk)
Processing Unit: GPU - 4 x NVIDIA Tesla M60
Agent Pool: Azure Pipelines
Agent Specification:
requirements.txt file:
Does anyone have a solution for this error?
I fixed the issue by upgrading PyTorch from 1.4.0 to 1.6.0. The requirements.txt now looks like this:
At first I tried an older version of sentence-transformers that was compatible with PyTorch 1.4.0, but that version doesn't support the "xlm-roberta-base" model, so I upgraded PyTorch instead.
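To catch this kind of mismatch before the job crashes, a small guard along these lines could run at the top of the script. This is only a sketch: the 1.6.0 threshold is simply the version that worked for me, and the helper names (`version_tuple`, `is_new_enough`, `check_torch`) are made up for illustration. It reads the installed version from package metadata, so it does not import torch itself.

```python
import re
from importlib import metadata

# Assumption: 1.6.0 is the version that fixed the crash in my environment.
MIN_TORCH = (1, 6, 0)


def version_tuple(version: str) -> tuple:
    """Turn a string like '1.6.0+cu101' into (1, 6, 0), ignoring build suffixes."""
    numbers = [int(n) for n in re.findall(r"\d+", version.split("+")[0])]
    return tuple((numbers + [0, 0, 0])[:3])


def is_new_enough(installed: str, minimum: tuple = MIN_TORCH) -> bool:
    """Compare the installed version against the minimum, component by component."""
    return version_tuple(installed) >= minimum


def check_torch() -> str:
    """Return a status message based on package metadata only."""
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return "torch is not installed"
    if is_new_enough(installed):
        return f"torch {installed} OK"
    wanted = ".".join(str(part) for part in MIN_TORCH)
    return f"torch {installed} is older than {wanted}"


if __name__ == "__main__":
    print(check_torch())
```

Running this before `from sentence_transformers import SentenceTransformer` turns a silent segfault into a readable message in the job log.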