I'd like to get the gerund form of a word. I have not found a straightforward way to call a library to get it.
I applied the spelling rules for adding 'ing', but I am getting some errors because of exceptions to those rules. So I check each generated word against the CMU dictionary word list to make sure it is a real word. The code looks as follows:
import cmudict
import re

ing = 'ing'
vowels = "aeiou"
consonants = "bcdfghjklmnpqrstvwxyz"
words = ['lead', 'take', 'hit', 'begin', 'stop', 'refer', 'visit']
cmu_words = cmudict.words()
g_w = []

# count_syllables() is a helper that is not shown in this snippet.
for word in words:
    if word[-1] == 'e':
        # Rule 1: drop the final "e" and add "ing".
        if word[:-1] + ing in cmu_words:
            g_w.append(word[:-1] + ing)
    elif count_syllables(word) == 1 and word[-2] in vowels and word[-1] in consonants:
        if len(word) > 2 and word[-3] in vowels:
            # Two vowels before the final consonant ("lead"): no doubling.
            if word + ing in cmu_words:
                g_w.append(word + ing)
        else:
            # Rule 2: one syllable, vowel + consonant: double the consonant.
            if word + word[-1] + ing in cmu_words:
                g_w.append(word + word[-1] + ing)
    elif count_syllables(word) > 1 and word[-2] in vowels and word[-1] in consonants:
        # Rules 3 and 4: try the doubled form first, else fall back to plain "ing".
        if word + word[-1] + ing in cmu_words:
            g_w.append(word + word[-1] + ing)
        else:
            if word + ing in cmu_words:
                g_w.append(word + ing)

print(g_w)
The rules are as follows:
When a verb ends in "e", drop the "e" and add "-ing". For example: "take + ing = taking".
When a one-syllable verb ends in vowel + consonant, double the final consonant and add "-ing". For example: "hit + ing = hitting".
When a verb with more than one syllable ends in vowel + consonant with the stress on the final syllable, double the consonant and add "-ing". For example: "begin + ing = beginning".
Do not double the consonant of words with more than one syllable if the stress is not on the final syllable. For example: "visit + ing = visiting".
Is there a more efficient way to get the gerund of a word, if one exists?
Thanks
Maybe this is what you are looking for: a library called pyinflect.
There is a variety of tags available for getting inflections including the 'VBG' tag (Verb, Gerund) you are looking for.
Here is a sample implementation.
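A minimal sketch, assuming pyinflect is installed (pip install pyinflect) and that its getInflection(lemma, tag) function returns a tuple of forms, or None when it finds nothing. The names to_gerunds and inflect are made up for this illustration and are not part of the library; the inflect parameter is only there so the function can be tried with a stand-in when the library is not available.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# An inflector takes (lemma, tag) and returns a tuple of forms or None.
Inflector = Callable[[str, str], Optional[Tuple[str, ...]]]


def _pyinflect_inflector() -> Inflector:
    # Imported lazily so the rest of the sketch does not depend on the library.
    from pyinflect import getInflection  # third-party package

    return lambda lemma, tag: getInflection(lemma, tag=tag)


def to_gerunds(words: Iterable[str],
               inflect: Optional[Inflector] = None) -> Dict[str, Optional[str]]:
    """Map each word to its 'VBG' (gerund / present participle) form.

    Words for which the inflector returns nothing are mapped to None.
    """
    if inflect is None:
        inflect = _pyinflect_inflector()

    result: Dict[str, Optional[str]] = {}
    for word in words:
        forms = inflect(word, "VBG")
        # Keep the first form when several spellings are returned.
        result[word] = forms[0] if forms else None
    return result
```

Calling to_gerunds(['lead', 'take', 'hit', 'begin', 'stop', 'refer', 'visit']) would then look up each word through pyinflect, so the doubling and stress rules do not have to be coded by hand.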
NOTE: The authors have set up a more sophisticated and benchmarked library called LemmInflect, which does both lemmatization and inflection. Do check it out if you want something more reliable than the library above. The syntax is pretty much the same.