How to find the lexicographer id in WordNet's nt file without a library


I'm trying to link VerbNet with WordNet using the files they provide to work directly with data:

VerbNet => http://verbs.colorado.edu/verb-index/vn/verbnet-3.3.tar.gz

WordNet => http://wordnet-rdf.princeton.edu/static/wordnet.nt.gz

The verbs in VerbNet have a link to WordNet through their sense_key:

e.g. live%2:31:00::

This would be the structure of sense_key:

(lemma)%(part_of_speech_number):(lexical_file_number):(lexicographer_id)::
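
As a rough sketch, assuming Python and the structure shown above, a sense_key can be split into its parts like this. The names `SenseKey` and `parse_sense_key` are only illustrative, and the two trailing fields (head_word and head_id, which are only filled in for adjective satellites) are ignored:

```python
from typing import NamedTuple


class SenseKey(NamedTuple):
    """Illustrative container for the parts of a sense_key (not a WordNet API)."""
    lemma: str
    part_of_speech_number: int
    lexical_file_number: int
    lexicographer_id: int


def parse_sense_key(key: str) -> SenseKey:
    # Everything before '%' is the lemma; the rest is colon-separated.
    lemma, _, rest = key.partition("%")
    # Keep only the first three fields; the trailing '::' yields empty strings.
    pos_number, lex_filenum, lex_id = rest.split(":")[:3]
    return SenseKey(lemma, int(pos_number), int(lex_filenum), int(lex_id))


print(parse_sense_key("live%2:31:00::"))
```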

Parsing the n-triples of the nt file, I have found all the data except the lexicographer_id:

lemma => live 
part_of_speech_number => 2 
lexical_file_number => 31
lexicographer_id => ??

There is 1 answer

Adrián Rivero On BEST ANSWER

Parsing the wordnet.nt file doesn't seem to give you this information.

If you download the WordNet 3.1 database from http://wordnetcode.princeton.edu/wn3.1.dict.tar.gz (linked from https://wordnet.princeton.edu/download/current-version), you'll find the file "index.sense", which contains entries like these:

bethel%1:06:00:: 02836245 1 0
bethink%2:31:00:: 00685046 2 1
bethink%2:39:00:: 02171205 1 3
bethlehem%1:15:00:: 08813084 2 0

The format of this file is documented at https://wordnet.princeton.edu/documentation/senseidx5wn

The first field on each line is the sense_key, which is what VerbNet uses and which already includes the lexicographer id. The second field is the synset_offset, which coincides with the synset identifier in wordnet.nt, so it can be used to join the two sources.
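
A minimal sketch of reading those entries in Python, assuming the four whitespace-separated fields described in the senseidx documentation (sense_key, synset_offset, sense_number, tag_cnt). The names `SenseEntry` and `parse_index_sense` are hypothetical, and the sample lines are the ones quoted above:

```python
from typing import Dict, Iterable, NamedTuple


class SenseEntry(NamedTuple):
    """Illustrative record for one index.sense line (not a WordNet API)."""
    sense_key: str
    synset_offset: str  # kept as a string to preserve leading zeros
    sense_number: int
    tag_cnt: int


def parse_index_sense(lines: Iterable[str]) -> Dict[str, SenseEntry]:
    """Map each sense_key to its parsed entry, skipping malformed lines."""
    entries: Dict[str, SenseEntry] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue
        key, offset, number, count = parts
        entries[key] = SenseEntry(key, offset, int(number), int(count))
    return entries


sample = [
    "bethel%1:06:00:: 02836245 1 0",
    "bethink%2:31:00:: 00685046 2 1",
    "bethink%2:39:00:: 02171205 1 3",
    "bethlehem%1:15:00:: 08813084 2 0",
]
index = parse_index_sense(sample)
print(index["bethink%2:31:00::"].synset_offset)  # prints 00685046
```

For the real file you would pass an open file handle instead of `sample`; where the file ends up after extracting the archive is something to check locally.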

From "index.sense" you can also get the sense number (the third field), which lets you match against the structure "word.pos.sense_number", as in "man.n.02".
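
As a hedged illustration, the string could be built from an index.sense line like this. The `sense_name` helper and the `POS_LETTERS` table are my own naming, and the two-digit zero padding is an assumption taken from the "man.n.02" example. I have not checked that the result matches the identifiers of any particular library:

```python
# WordNet synset type numbers mapped to one-letter part-of-speech tags.
POS_LETTERS = {"1": "n", "2": "v", "3": "a", "4": "r", "5": "s"}


def sense_name(index_line: str) -> str:
    """Build a 'word.pos.sense_number' string from one index.sense line."""
    sense_key, _offset, sense_number, _tag_cnt = index_line.split()
    lemma, _, rest = sense_key.partition("%")
    pos = POS_LETTERS[rest.split(":")[0]]
    # Zero-pad the sense number to two digits (assumed from 'man.n.02').
    return f"{lemma}.{pos}.{int(sense_number):02d}"


print(sense_name("bethink%2:31:00:: 00685046 2 1"))  # prints bethink.v.02
```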