Gerund form of a word in Python

I'd like to get the gerund form of a string. I have not found a straightforward way to invoke a library to get the gerund.

I applied the usual rules for adding "-ing", but I am getting some errors because of exceptions to those rules. So I check each generated word against the CMU dictionary word list to make sure it is a real word. The code looks as follows:

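import cmudict

ing = 'ing'
vowels = "aeiou"
consonants = "bcdfghjklmnpqrstvwxyz"
words = ['lead', 'take', 'hit', 'begin', 'stop', 'refer', 'visit']
cmu_words = set(cmudict.words())  # a set keeps the membership tests fast
cmu_dict = cmudict.dict()         # word -> list of pronunciations
g_w = []

def count_syllables(word):
    # Assumed helper (not shown in the original post): count the vowel
    # phonemes, which carry a stress digit (0, 1 or 2) in the CMU dictionary.
    prons = cmu_dict.get(word)
    if not prons:
        return 0
    return sum(ph[-1].isdigit() for ph in prons[0])

for word in words:
    if word[-1] == 'e':
        # drop the final "e": take -> taking
        if word[:-1] + ing in cmu_words:
            g_w.append(word[:-1] + ing)
    elif count_syllables(word) == 1 and word[-2] in vowels and word[-1] in consonants:
        if len(word) > 2 and word[-3] in vowels:
            # two vowels before the consonant, no doubling: lead -> leading
            if word + ing in cmu_words:
                g_w.append(word + ing)
        else:
            # vowel + consonant, double it: hit -> hitting
            if word + word[-1] + ing in cmu_words:
                g_w.append(word + word[-1] + ing)
    elif count_syllables(word) > 1 and word[-2] in vowels and word[-1] in consonants:
        # try the doubled form first, then fall back to the plain form
        if word + word[-1] + ing in cmu_words:
            g_w.append(word + word[-1] + ing)
        elif word + ing in cmu_words:
            g_w.append(word + ing)

print(g_w)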

The rules are as follows:

When a verb ends in "e", drop the "e" and add "-ing". For example: "take + ing = taking".
When a one-syllable verb ends in vowel + consonant, double the final consonant and add "-ing". For example: "hit + ing = hitting".
When a verb ends in vowel + consonant with stress on the final syllable, double the consonant and add "-ing". For example: "begin + ing = beginning".
Do not double the consonant of words with more than one syllable if the stress is not on the final syllable. For example: "visit + ing = visiting".
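
As a rough illustration, the four rules can be written as one small function. This is only a sketch under stated assumptions: `naive_gerund` and its `final_stress` flag are hypothetical names, the caller has to say whether the last syllable is stressed, and the rule that a final "w", "x" or "y" is never doubled is an addition that is not in the list above. Irregular cases such as "see" or "die" are not covered.

```python
VOWELS = set("aeiou")
# Assumption (not in the rules above): a final w, x or y is never doubled,
# e.g. "fix" -> "fixing".
NEVER_DOUBLED = set("wxy")


def naive_gerund(word, final_stress=True):
    """Apply the four spelling rules to a base-form verb.

    final_stress is a hypothetical flag: True when the last syllable is
    stressed (always the case for one-syllable verbs).
    """
    word = word.lower()
    # Rule 1: drop a final "e" (take -> taking).
    if word.endswith("e"):
        return word[:-1] + "ing"
    # Rules 2-4: double the final consonant only after a single vowel
    # and only when the final syllable is stressed.
    ends_cvc = (
        len(word) >= 3
        and word[-1] not in VOWELS
        and word[-1] not in NEVER_DOUBLED
        and word[-2] in VOWELS
        and word[-3] not in VOWELS
    )
    if ends_cvc and final_stress:
        return word + word[-1] + "ing"
    return word + "ing"


for verb in ["lead", "take", "hit", "begin", "stop", "refer"]:
    print(verb, "->", naive_gerund(verb))
print("visit", "->", naive_gerund("visit", final_stress=False))
```

The stress flag is the part the spelling alone cannot supply, which is why a dictionary lookup or a library is still needed for words like "visit".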

Is there a more efficient way to get the gerund of a word, if one exists?

Thanks

There is 1 answer

Answer by Akshay Sehgal (best answer)

Maybe this is what you are looking for: a library called pyinflect.

A python module for word inflections that works as a spaCy extension. To use standalone, import the method getAllInflections and/or getInflection and call them directly. The method getInflection takes a lemma and a Penn Treebank tag and returns a tuple of the specific inflection(s) associated with it.

There is a variety of tags available for getting inflections including the 'VBG' tag (Verb, Gerund) you are looking for.

pos_type = 'A'
* JJ      Adjective
* JJR     Adjective, comparative
* JJS     Adjective, superlative
* RB      Adverb
* RBR     Adverb, comparative
* RBS     Adverb, superlative

pos_type = 'N'
* NN      Noun, singular or mass
* NNS     Noun, plural

pos_type = 'V'
* VB      Verb, base form
* VBD     Verb, past tense
* VBG     Verb, gerund or present participle
* VBN     Verb, past participle
* VBP     Verb, non-3rd person singular present
* VBZ     Verb, 3rd person singular present
* MD      Modal

Here is a sample implementation.

#!pip install pyinflect
from pyinflect import getInflection

words = ['lead','take','hit','begin','stop','refer','visit']
[getInflection(i, 'VBG') for i in words]
[('leading',),
 ('taking',),
 ('hitting',),
 ('beginning',),
 ('stopping', 'stoping'),
 ('referring',),
 ('visiting',)]
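
Each result above is a tuple, and "stop" comes back with two spellings, so a little post-processing may help if one string per word is wanted. The sketch below assumes the first entry is the preferred spelling (as it is for "stopping" above) and that a missing word yields an empty or `None` result. `first_forms` and `fake_inflect` are hypothetical names, and `fake_inflect` is only a stand-in so the example runs without the library.

```python
def first_forms(words, inflect, tag="VBG"):
    """Map each word to its first inflected form, or None if nothing comes back.

    `inflect` is any callable taking (lemma, tag) and returning a tuple of
    strings, such as getInflection as used in the sample above.
    """
    result = {}
    for word in words:
        forms = inflect(word, tag)
        result[word] = forms[0] if forms else None
    return result


# Stand-in for getInflection; the data is copied from the output above.
SAMPLE = {"stop": ("stopping", "stoping"), "visit": ("visiting",)}


def fake_inflect(lemma, tag):
    return SAMPLE.get(lemma) if tag == "VBG" else None


print(first_forms(["stop", "visit", "xyzzy"], fake_inflect))
# prints {'stop': 'stopping', 'visit': 'visiting', 'xyzzy': None}
```

With pyinflect installed, passing `getInflection` itself in place of `fake_inflect` should give a result of the same shape.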

NOTE: The authors have also set up a more sophisticated, benchmarked library called LemmInflect, which does both lemmatization and inflection. Do check it out if you want something more reliable than the library above. The syntax is pretty much the same.