How to use HeidelTime temporal tagger inside a Java project?

I would like to automatically identify dates in a stream of documents, and for this I would like to use the code provided by the open source project HeidelTime, available here (https://code.google.com/p/heideltime/). I have installed the HeidelTime kit (not the standalone version), and now I am wondering how I can reference it and call it from my Java project. I have already added a dependency on HeidelTime to my pom.xml:

    <dependency>
        <groupId>de.unihd.dbs</groupId>
        <artifactId>heideltime</artifactId>
        <version>1.7</version>
    </dependency>

However, I am not sure how to call the classes from this project in my own project. I am using Maven for both. Could anyone who has used it before give me a suggestion or a piece of advice? Many thanks!

There are 3 answers

0
Shenal On

This library is not in the Maven Central repository yet. (You can check this on search.maven.org.)

To use the library in your project, download the JAR file and install it locally. See the answers to this question: How to add local jar files in maven project?

Then you can import the package and use its functionality in your project.

1
user3776894 On

Adding to the reply from jgloves: you may want to parse the HeidelTime result string into a Java object representation. The following code transforms the UIMA XMI representation into Timex3 objects.

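    // jcasFactory is not defined in the original snippet; it is assumed to be a
    // JCas factory configured with the HeidelTime type system.
    HeidelTimeStandalone time = new HeidelTimeStandalone(Language.GERMAN, DocumentType.SCIENTIFIC, OutputType.XMI, "config.props", POSTagger.STANFORDPOSTAGGER);
    String xmiRepresentation = time.process(document, documentCreationTime); // apply HeidelTime and get the UIMA XMI representation
    JCas cas = jcasFactory.createJCas();
    // Assumed missing step: load the XMI string into the CAS (org.apache.uima.cas.impl.XmiCasDeserializer),
    // otherwise the annotation index below stays empty.
    XmiCasDeserializer.deserialize(new ByteArrayInputStream(xmiRepresentation.getBytes(StandardCharsets.UTF_8)), cas.getCas(), true);

    for (FSIterator<Annotation> it = cas.getAnnotationIndex(Timex3.type).iterator(); it.hasNext(); ) {
        System.out.println(it.next());
    }

0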
jgloves On

heideltime-kit is itself a Maven project, so you can add it as a dependency. (In NetBeans, right-click Dependencies --> Add Dependency --> Open Projects --> HeidelTime; make sure the heideltime-kit project is open first.)

Then move the config.props file into your project's src/main/resources folder, and set the path to TreeTagger within config.props.

To use the classes, create an instance of HeidelTimeStandalone (see de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.java), passing POSTagger.TREETAGGER as the posTagger parameter and a hardcoded path to your src/main/resources/config.props file as the configPath parameter. For example:

    HeidelTimeStandalone heidelTime = new HeidelTimeStandalone(
            Language.ENGLISH,
            DocumentType.COLLOQUIAL,
            OutputType.TIMEML,
            "path/to/config.props",
            POSTagger.TREETAGGER,
            true); // true enables interval tagging
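
If you would rather not hardcode the path, one possible alternative is to look config.props up on the classpath (where Maven places the contents of src/main/resources) and copy it to a temporary file whose path can be passed as configPath. This is only a sketch: ClasspathConfig and copyToTempFile are hypothetical names, not part of HeidelTime, and it assumes that configPath only needs to point to a readable file.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not part of HeidelTime. */
final class ClasspathConfig {

    private ClasspathConfig() {}

    /**
     * Copies a classpath resource (for example "config.props") to a temporary
     * file and returns the path of that file.
     */
    static Path copyToTempFile(String resourceName) throws IOException {
        ClassLoader loader = ClasspathConfig.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new FileNotFoundException("Not on classpath: " + resourceName);
            }
            Path tmp = Files.createTempFile("heideltime-config", ".props");
            tmp.toFile().deleteOnExit(); // clean up when the JVM exits
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        }
    }
}
```

The result of ClasspathConfig.copyToTempFile("config.props").toString() could then be passed in place of "path/to/config.props". Copying to a temporary file also works when the resource is packaged inside a JAR, where no plain file path exists.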

Then to use HeidelTime to process text, you can simply call the process function:

    String result = heidelTime.process(text, date);
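
With OutputType.TIMEML, the result is an XML string in which each detected temporal expression is wrapped in a TIMEX3 tag. If you only need the expressions and their normalized values, a regular expression may be enough. The sketch below assumes the tags are not nested and carry their attributes (such as tid, type, and value) in double quotes. The sample string is illustrative input, not actual HeidelTime output, and Timex3Extractor and Timex are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of HeidelTime. */
final class Timex3Extractor {

    /** One TIMEX3 annotation: its attributes plus the text it covers. */
    static final class Timex {
        final Map<String, String> attributes;
        final String text;

        Timex(Map<String, String> attributes, String text) {
            this.attributes = attributes;
            this.text = text;
        }
    }

    // Matches <TIMEX3 ...>covered text</TIMEX3>; assumes TIMEX3 tags are not nested.
    private static final Pattern TAG =
            Pattern.compile("<TIMEX3\\s+([^>]*)>(.*?)</TIMEX3>", Pattern.DOTALL);
    // Matches name="value" pairs inside the opening tag.
    private static final Pattern ATTRIBUTE = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    static List<Timex> extract(String timeMl) {
        List<Timex> result = new ArrayList<>();
        Matcher tag = TAG.matcher(timeMl);
        while (tag.find()) {
            Map<String, String> attributes = new HashMap<>();
            Matcher attribute = ATTRIBUTE.matcher(tag.group(1));
            while (attribute.find()) {
                attributes.put(attribute.group(1), attribute.group(2));
            }
            result.add(new Timex(attributes, tag.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative input only, shaped like a TimeML document.
        String sample = "<TimeML>\nThe meeting is on "
                + "<TIMEX3 tid=\"t1\" type=\"DATE\" value=\"2015-03-05\">March 5, 2015</TIMEX3>"
                + ".\n</TimeML>";
        for (Timex timex : extract(sample)) {
            System.out.println(timex.attributes.get("value") + " <- " + timex.text);
        }
    }
}
```

In your own code you would call Timex3Extractor.extract(result) on the string returned by process. For anything beyond simple extraction, the XMI route with Timex3 objects shown in the other answer is likely the more robust choice.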