How to import a corpus from NLTK into a variable to form n-grams in Python?


I want to form n-grams from the NLTK Reuters corpus. I tested my n-gram code on a small corpus saved on my local disk as a text file, which I read with:

import nltk
file = open('dummytext.txt', encoding = 'utf8').read()

Now that my n-gram probability code makes sense to me, I want to use the NLTK Reuters corpus, which is huge. When I do the following:

import nltk
from nltk.corpus import reuters
file = reuters.words()

The processing to form unigrams seems to run forever.

How can I load the NLTK corpus as a string in a variable so that I can form n-grams with NLTK?
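For context, reuters.words() returns a lazy corpus view of tokens rather than a str, while reuters.raw() returns the whole corpus as a single string, comparable to what open(...).read() gives for the dummy file. The counting code is not shown here, so the following is only a sketch under one assumption: that the slowdown comes from rescanning the token list for every word, which becomes very slow on a corpus of this size. The helper names iter_ngrams, count_ngrams and reuters_tokens are hypothetical and used for illustration only. The idea is to walk the tokens once and let a Counter do the tallying.

```python
from collections import Counter, deque


def iter_ngrams(tokens, n):
    """Yield n-grams as tuples from any iterable of tokens, in a single pass."""
    window = deque(maxlen=n)  # sliding window holding the last n tokens
    for token in tokens:
        window.append(token)
        if len(window) == n:
            yield tuple(window)


def count_ngrams(tokens, n):
    """Count every n-gram in one pass; no repeated scans of the corpus."""
    return Counter(iter_ngrams(tokens, n))


def reuters_tokens():
    """Hypothetical helper: stream lowercased Reuters tokens.

    Assumes NLTK is installed and nltk.download('reuters') has been run once.
    The import is kept inside the function so the helpers above work without NLTK.
    """
    from nltk.corpus import reuters
    return (word.lower() for word in reuters.words())


# Usage on the real corpus (commented out because it needs the corpus data):
# unigrams = count_ngrams(reuters_tokens(), 1)
# bigrams = count_ngrams(reuters_tokens(), 2)
# p = bigrams[("of", "the")] / unigrams[("of",)]  # estimate of P(the | of)
```

NLTK's own nltk.ngrams and nltk.FreqDist can be used in place of the two helper functions. If a string really is needed, reuters.raw() provides one, but it then has to be tokenized again before n-grams can be formed.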
