Accessing the vocabulary used by the vectorizer of the best estimator in GridSearch

I wasn't sure how best to phrase this in the title.

This is what I am trying to do: I am using GridSearch with a pipeline to train classifiers. I would like to see the vocabulary_.items() of the CountVectorizer used by the best estimator.

Right now, I am doing this, after running GridSearch:

    classifier = gs_clf.best_estimator_
    vect = classifier.named_steps["vec"]
    data = vect.fit_transform(x_train)
    vocab = vect.vocabulary_.items()

Is there any way to get the vocabulary items directly, without using fit_transform again on the CountVectorizer?
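For reference, a minimal sketch of reading the vocabulary straight off the fitted step. It assumes gs_clf is a scikit-learn GridSearchCV left at its default refit=True, in which case best_estimator_ has already been refit on the full training data and its "vec" step should already carry vocabulary_. The toy data, the LogisticRegression step, the "clf" step name, and the parameter grid are made up for illustration and are not from the question.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data, purely illustrative (not from the question).
x_train = [
    "good movie", "great film", "nice plot", "good acting",
    "bad movie", "awful film", "poor plot", "bad acting",
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]

# "vec" matches the step name in the question; "clf" is an assumed name.
pipeline = Pipeline([
    ("vec", CountVectorizer()),
    ("clf", LogisticRegression()),
])

# refit=True is the default, so best_estimator_ is refit on all of x_train.
gs_clf = GridSearchCV(pipeline, {"vec__ngram_range": [(1, 1), (1, 2)]}, cv=2)
gs_clf.fit(x_train, y_train)

# The vectorizer inside best_estimator_ is already fitted: no fit_transform needed.
vect = gs_clf.best_estimator_.named_steps["vec"]
vocab = vect.vocabulary_  # dict mapping each term to its column index

# Print (term, index) pairs ordered by column index.
print(sorted(vocab.items(), key=lambda kv: kv[1]))
```

If the search was created with refit=False, best_estimator_ is not available at all, so this shortcut would not apply in that case.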

There are 0 answers