Float is not iterable error on nltk.naivebayes classifier


I am new to Python and am training nltk's NaiveBayesClassifier on training_set, which is built by the following line of code:

training_set = apply_features(extract_features, training_tweets)

training_set is a LazyMap and training_tweets is a list of tuples in the following format:

('ليت كل ايام السنة رمضان', 'positive') where the first part is an Arabic tweet and the second part is sentiment.

The function extract_features is below:

def extract_features(document):
    document_words = set(document)
    features = {}
    for word in word_features:
        features['contains(%s)' % word] = (word in document_words)
    return features

This code works on English tweets. I am not sure where I should look for the float object in my list of tuples.

Any help is appreciated. It gives the following traceback:

File "C:/Users/Owner/nb.py", line 64, in <module>
classifier = NaiveBayesClassifier.train(training_set)

File "C:\Users\Owner\Anaconda3\lib\site-packages\nltk\classify\naivebayes.py", line 194, in train
for featureset, label in labeled_featuresets:

File "C:\Users\Owner\Anaconda3\lib\site-packages\nltk\util.py", line 946, in iterate_from
try: yield self._func(self._lists[0][index])

File "C:\Users\Owner\Anaconda3\lib\site-packages\nltk\classify\util.py", line 65, in lazy_func
return (feature_func(labeled_token[0]), labeled_token[1])

File "C:/Users/Owner/nb.py", line 35, in extract_features
document_words = set(document)

TypeError: 'float' object is not iterable

There is 1 answer

Terry Jan Reedy

The exception comes from the expression set(document), which assumes that document, the argument to extract_features, is an iterable.
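As a minimal illustration of that point (the helper name try_set is made up for this sketch and is not part of nltk), calling set() on a float raises exactly the message in the traceback, while calling it on a list of words works:

```python
def try_set(document):
    """Return set(document), or the TypeError message if document is not iterable."""
    try:
        return set(document)
    except TypeError as exc:
        return str(exc)


# A list of words is iterable, so set() succeeds.
print(try_set(["good", "day"]))

# A float is not iterable, so set() raises TypeError.
print(try_set(float("nan")))
```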

That function is called inside lazy_func under the local name feature_func, with the argument labeled_token[0]. So labeled_token[0], the first element of one of the tuples in training_tweets, is a float rather than an iterable.

You will have to work back through the code of the functions in the traceback to see why. You may want to add some print statements here and there to expose intermediate values.
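One way to expose the offending value is to scan the list before training. The sketch below is only a suggestion: find_bad_entries and sample_tweets are hypothetical names, and the check assumes each tweet should be a str, as in the format shown in the question (adjust the isinstance test if your tweets are tokenized into lists). A common source of a float in text data is a missing value that was loaded as NaN, for example by pandas, but whether that applies here is an assumption, since the question does not show how training_tweets was built.

```python
def find_bad_entries(labeled_tweets):
    """Return (index, entry) pairs whose text part is not a str."""
    bad = []
    for index, entry in enumerate(labeled_tweets):
        text = entry[0]  # same value that lazy_func passes to extract_features
        if not isinstance(text, str):
            bad.append((index, entry))
    return bad


# Hypothetical sample data: the second tuple has NaN where the tweet should be.
sample_tweets = [
    ("ليت كل ايام السنة رمضان", "positive"),
    (float("nan"), "negative"),
]

for index, entry in find_bad_entries(sample_tweets):
    print(index, type(entry[0]).__name__, repr(entry))
```

Running the same function on the real training_tweets should show which rows to drop or repair before calling apply_features.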