module 'tensorflow_datasets.core.features' has no attribute 'text'


Good day everyone. I am building a sentiment analysis model in TensorFlow using Amazon reviews of electronics. I use TensorFlow Datasets to tokenize the review text, but the tokenizer step fails. Here is the part of the code that raises the error:

tokenizer = tfds.features.text.Tokenizer()

vocabulary_set = set()
for _, reviews in train_dataset.enumerate():
    review_text = reviews['data']
    reviews_tokens = tokenizer.tokenize(review_text.get('review_body').numpy())
    vocabulary_set.update(reviews_tokens)
vocab_size = len(vocabulary_set)
vocab_size

The first line raises an AttributeError:

AttributeError                            Traceback (most recent call last)
<ipython-input-17-1c32dce13853> in <module>()
----> 1 tokenizer = tfds.features.text.Tokenizer()
AttributeError: module 'tensorflow_datasets.core.features' has no attribute 'text'

How can I resolve this error? Thank you.


There is 1 answer

Nicolas Gervais - Open to Work (best answer)

The tokenizer no longer lives under tfds.features.text. It's deprecated, but you can still access it under tfds.deprecated.text like this:

import tensorflow_datasets as tfds

tokenizer = tfds.deprecated.text.Tokenizer()

tokenizer.tokenize('hey how are you?')
['hey', 'how', 'are', 'you']
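
If you would rather not depend on a deprecated module, here is a minimal sketch of the same vocabulary-building loop using only the standard library. It is not the TFDS implementation: the regex tokenizer is a stand-in that only roughly mimics the output shown above (it keeps runs of word characters and drops punctuation), and the names `tokenize`, `build_vocabulary`, and `sample_reviews` are hypothetical, with made-up review strings in place of the real dataset.

```python
import re

# Stand-in tokenizer (an assumption, not the TFDS code): keep runs of
# word characters, drop punctuation and whitespace.
_WORD_RE = re.compile(r"\w+")


def tokenize(text):
    """Split text into word tokens; accepts str or UTF-8 encoded bytes."""
    # Calling .numpy() on a string tensor yields bytes, so decode first.
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    return _WORD_RE.findall(text)


def build_vocabulary(texts):
    """Collect the set of distinct tokens across all texts."""
    vocabulary = set()
    for text in texts:
        vocabulary.update(tokenize(text))
    return vocabulary


# Made-up reviews standing in for reviews['data']['review_body'].
sample_reviews = [
    b"Great headphones, great price!",
    "Battery died after a week.",
]

vocabulary_set = build_vocabulary(sample_reviews)
vocab_size = len(vocabulary_set)
print(vocab_size)
```

In the original loop you would pass `review_text.get('review_body').numpy()` to `tokenize` instead of the sample strings. Note that this sketch is case-sensitive, so 'Great' and 'great' count as two separate vocabulary entries; lowercase the text first if that is not what you want.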