What do the numbers appended to words in Google Ngrams data mean?


I understand from the info/download page that the format for Google Ngrams data is

ngram TAB year TAB match_count TAB volume_count NEWLINE

Here's a small extract from the file of 1-grams that start with "a":

announced.37_VERB 2008 1 1
annually.34 1913 2 2
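
For reference, here is a minimal sketch of how such lines could be split into their fields. It assumes the fields are tab-separated as the format line states, and that a 1-gram has the shape word, optional ".digits" suffix, optional "_TAG" suffix, which is only inferred from the two sample lines above. The names NgramRecord and parse_line are hypothetical, and the code only separates the numeric suffix syntactically; it makes no claim about what that number means.

```python
import re
from typing import NamedTuple, Optional


class NgramRecord(NamedTuple):
    token: str              # the word itself, e.g. "announced"
    suffix: Optional[str]   # digits after the period, e.g. "37" (meaning unknown)
    pos_tag: Optional[str]  # part-of-speech tag, e.g. "VERB"
    year: int
    match_count: int
    volume_count: int


# Assumed shape of a 1-gram: word, optional ".<digits>", optional "_<TAG>".
# Caveat: a token that is itself a decimal such as "3.14" would be split too.
_NGRAM_RE = re.compile(
    r"^(?P<token>.+?)(?:\.(?P<suffix>\d+))?(?:_(?P<pos>[A-Z]+))?$"
)


def parse_line(line: str) -> NgramRecord:
    """Parse one 'ngram TAB year TAB match_count TAB volume_count' line."""
    ngram, year, match_count, volume_count = line.rstrip("\n").split("\t")
    m = _NGRAM_RE.match(ngram)
    if m is None:
        raise ValueError(f"unexpected ngram field: {ngram!r}")
    return NgramRecord(
        token=m.group("token"),
        suffix=m.group("suffix"),
        pos_tag=m.group("pos"),
        year=int(year),
        match_count=int(match_count),
        volume_count=int(volume_count),
    )


if __name__ == "__main__":
    for raw in ("announced.37_VERB\t2008\t1\t1", "annually.34\t1913\t2\t2"):
        print(parse_line(raw))
```

Keeping the suffix as a separate string field makes it easy to count how often each value occurs across a file, which might help in working out what it represents.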

I understand that the _VERB part is a part-of-speech (POS) tag. However, I couldn't find reliable documentation on what the numbers after the period mean, i.e. .37 or .34. If someone could provide a lead on this, it would be a great help to everyone getting started with NLP using Google Ngrams as a data source.
