WordNet ws4j: confusing Lesk value, and iterating over all the synsets


This may be totally normal, but I am using ws4j for Java, and with the demo code I get values like "1.7345..." for the Lesk measure between two words. On the demo website, http://ws4jdemo.appspot.com/?mode=w&s1=&w1=solve&s2=&w2=determine, the Lesk measure is a whole number, like "57". I can't find a reason for the difference, but I'm also new to programming in general.

I want to write something that takes a word1 and iterates over the rest of the words, returning only the words whose Lesk measure (compared to word1) is above a certain value. That brings me to a related question: in Python, I can iterate over all the synsets with

for x in wn.all_synsets():

But I don't know how to do the same with ws4j. Is there an equivalent?

Answer by user3503711:

Why do you need to iterate over all synsets when you only need the Lesk value? Try this:

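import edu.cmu.lti.lexical_db.ILexicalDatabase;
import edu.cmu.lti.lexical_db.NictWordNet;
import edu.cmu.lti.ws4j.RelatednessCalculator;
import edu.cmu.lti.ws4j.impl.Lesk;
import edu.cmu.lti.ws4j.util.WS4JConfiguration;

private static ILexicalDatabase db = new NictWordNet();
private static RelatednessCalculator[] rcs = { new Lesk(db) };

// Returns the Lesk score of word1 and word2 if it is above threshold (your value),
// otherwise -1 as a "not related enough" marker.
private static double run(String word1, String word2, double threshold) {
    WS4JConfiguration.getInstance().setMFS(true); // MFS = most frequent sense
    double s = 0;
    for (RelatednessCalculator rc : rcs) {
        s = rc.calcRelatednessOfWords(word1, word2);
    }
    // Every code path must return a value, or the method will not compile.
    return s > threshold ? s : -1;
}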
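
To get back only the words whose score against word1 is above a certain value, the pairwise call above can be wrapped in a small filter. The following is a minimal sketch, not part of ws4j: the class name LeskFilter, the method filterByScore, and the toy scorer with made-up numbers are all hypothetical and only there so the example runs without the library. In real use the scorer would presumably be something like (a, b) -> rc.calcRelatednessOfWords(a, b), and the candidate list is whatever word list you already have.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

final class LeskFilter {

    // Returns the candidates whose score against word1 is strictly above threshold.
    // `scorer` is a stand-in for a relatedness call (assumption: in real use it
    // would delegate to calcRelatednessOfWords from the snippet above).
    static List<String> filterByScore(String word1, List<String> candidates,
                                      double threshold,
                                      ToDoubleBiFunction<String, String> scorer) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (scorer.applyAsDouble(word1, candidate) > threshold) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Toy scorer with invented values, used only so the sketch runs without ws4j.
        ToDoubleBiFunction<String, String> toyScorer =
                (a, b) -> b.equals("determine") ? 57.0 : 1.0;
        List<String> words = Arrays.asList("determine", "banana", "cloud");
        System.out.println(filterByScore("solve", words, 10.0, toyScorer));
    }
}
```

Passing the scorer in as a parameter keeps the threshold logic independent of which relatedness measure is used, so the same filter would work if Lesk were swapped for another calculator.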