NLP Pipeline, DKPro, Ruta - Missing Descriptor Error


I am trying to run a RUTA script with an analysis pipeline.

I add my script to the pipeline like so: createEngineDescription(RutaEngine.class, RutaEngine.PARAM_MAIN_SCRIPT, "mypath/myScript.ruta")

My ruta script file contains this:

IMPORT PACKAGE de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos
    FROM desc.type.POS AS pos;
IMPORT de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Lemma
    FROM desc.type.LexicalUnits;
IMPORT de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token
    FROM desc.type.LexicalUnits_customized AS token;
IMPORT de.tudarmstadt.ukp.dkpro.core.api.syntax.type.dependency.Dependency
    FROM desc.type.Dependency AS dep;
IMPORT de.tudarmstadt.ukp.dkpro.core.type.ReadabilityScore
    FROM desc.type.ReadabilityScore;
IMPORT de.tudarmstadt.ukp.dkpro.core.api.metadata.type.TagsetDescription
    FROM desc.type.metadata;

UIMAFIT de.tudarmstadt.ukp.dkpro.core.opennlp.OpenNlpSegmenter;
UIMAFIT de.tudarmstadt.ukp.dkpro.core.opennlp.OpenNlpPosTagger;
UIMAFIT de.tudarmstadt.ukp.dkpro.core.corenlp.CoreNlpLemmatizer;
UIMAFIT de.tudarmstadt.ukp.dkpro.core.maltparser.MaltParser;
UIMAFIT de.tudarmstadt.ukp.dkpro.core.readability.ReadabilityAnnotator;

uima.tcas.DocumentAnnotation{-CONTAINS(pos.POS)} -> {
    uima.tcas.DocumentAnnotation{-> SETFEATURE("language", "en")};
    EXEC(OpenNlpSegmenter);
    EXEC(OpenNlpPosTagger);
    EXEC(CoreNlpLemmatizer);
    EXEC(MaltParser);
    EXEC(ReadabilityAnnotator);
};

This generates the error: Annotation exception: Initialization of annotator class "org.apache.uima.ruta.engine.RutaEngine" failed. (Descriptor: unknown)

Do I need a descriptor? The answer to "How to create pipeline of java nlp and ruta scripts?" suggests to me that it's not required, but perhaps I am misunderstanding what is required. If it is needed, how do I add it?

I am using uimafit-core:2.5.+ and org.apache.uima:ruta-core:2.8.1

Looking for other solutions, I also tried this:

AnalysisEngine aae = createEngine(RutaEngine.class,
            RutaEngine.PARAM_MAIN_SCRIPT, "myscript.ruta",
            RutaEngine.PARAM_SCRIPT_PATHS, new String[] { "src/main/resources/ruta" },
            RutaEngine.PARAM_ADDITIONAL_EXTENSIONS, new String[] {
                    BooleanOperationsExtension.class.getName(),
                    StringOperationsExtension.class.getName()});

but with no improvement. I get the same error.

RodP On BEST ANSWER

I solved the problem. The error was thrown simply because the script could not be found. I had to change this line from RutaEngine.PARAM_MAIN_SCRIPT, "myscript.ruta" to RutaEngine.PARAM_MAIN_SCRIPT, "myscript" (that is, the script name without the .ruta extension).

However, I did a few other things before this that may have contributed to the solution, so I am listing them here:

  1. I added the Ruta nature to my Eclipse project
  2. I moved myscript from resources to a script package
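
To illustrate why dropping the extension probably matters: as far as I understand it, Ruta treats PARAM_MAIN_SCRIPT as a dotted script name (similar to a qualified Java class name) rather than a file name, so it turns the dots into path separators and appends .ruta itself. The sketch below is not Ruta's actual code; the class RutaScriptNameDemo and the helper toRelativePath are hypothetical and only mimic that assumed mapping.

```java
public class RutaScriptNameDemo {

    // Hypothetical helper (not part of Ruta): maps a dotted script name to the
    // relative file path that would be searched under each script path,
    // assuming dots become directory separators and ".ruta" is appended.
    public static String toRelativePath(String mainScript) {
        return mainScript.replace('.', '/') + ".ruta";
    }

    public static void main(String[] args) {
        // Name without extension: resolves to the expected file.
        System.out.println(toRelativePath("myscript"));      // myscript.ruta

        // Name with extension: ".ruta" is read as part of the dotted name,
        // so the lookup points at a file that does not exist.
        System.out.println(toRelativePath("myscript.ruta")); // myscript/ruta.ruta
    }
}
```

Under that assumption, "myscript.ruta" is looked up as myscript/ruta.ruta, which would explain why the engine failed to initialize until the extension was removed.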