I'm trying to use the WordCloud package in Python and get an error when I pass the include_numbers parameter. Below are the link to the package documentation, the parameter definition from that page (I've tried both the correct spelling and the misspelling shown in the docs), and the error I get.

https://amueller.github.io/word_cloud/generated/wordcloud.WordCloud.html

incldue_numbers:bool, default=False Whether to include numbers as phrases or not.

TypeError: __init__() got an unexpected keyword argument 'include_numbers'

Section I'm attempting to run:

import numpy as np # linear algebra
import pandas as pd 
import matplotlib as mpl
import matplotlib.pyplot as plt
##%matplotlib inline

from subprocess import check_output
from wordcloud import WordCloud, STOPWORDS

#mpl.rcParams['figure.figsize']=(8.0,6.0)    #(6.0,4.0)
mpl.rcParams['font.size']=12                #10 
mpl.rcParams['savefig.dpi']=100             #72 
mpl.rcParams['figure.subplot.bottom']=.1 


stopwords = set(STOPWORDS)
data = pd.read_csv("C:\\Users\\chris\\Documents\\testing\\wc_ad_copy_test.csv")

##test below
#data['dupe_copy'] = data['dupe_copy'].astype(str)
##end test



wordcloud = WordCloud(
                          background_color='white',
                          stopwords=stopwords,
                          max_words=200,
                          max_font_size=40, 
                          random_state=42,
                          include_numbers=True,
                          #collocations=True,
                          normalize_plurals=False
                         ).generate(" ".join(data['scored_copy'].astype(str)))  # join every row; str(Series) gives a truncated repr with index numbers



print(wordcloud)
fig = plt.figure(1)
plt.imshow(wordcloud)
plt.axis('off')
plt.show()
fig.savefig("ad_copy_cloud_image.png", dpi=900)


wc = WordCloud(
                          background_color='white',
                          stopwords=stopwords,
                          max_words=200,
                          max_font_size=40, 
                          random_state=42,
                          include_numbers=True,
                          #collocations=True,
                          normalize_plurals=False
                         )

# join every row; str(Series) gives a truncated repr with index numbers
word_dict = wc.process_text(" ".join(data['scored_copy'].astype(str)))

df = pd.DataFrame.from_dict(word_dict, orient='index')
df = df.reset_index()
df.columns = ['word', 'word_count']
df = df.sort_values(by='word_count', ascending=False)
df.to_csv("word_count_list.csv", index=False)

Passing include_numbers=False throws the same error.

I expect this to run and include numbers in the word cloud.

1 Answer

paul41 On

I looked through the wordcloud source code, and the issue seems to be that the code on GitHub and the package published on PyPI are not the same. The version that pip installs does not contain the include_numbers parameter.
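
One way to confirm this locally is to inspect the constructor's signature before passing the argument. The following is a minimal sketch, not part of the wordcloud package: accepts_keyword and filter_supported_kwargs are hypothetical helper names, and OldStyleCloud is a stand-in class used so the example runs without wordcloud installed.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj declares a parameter called `name`."""
    return name in inspect.signature(callable_obj).parameters


def filter_supported_kwargs(callable_obj, kwargs):
    """Keep only the keyword arguments that callable_obj declares."""
    return {k: v for k, v in kwargs.items() if accepts_keyword(callable_obj, k)}


# Hypothetical stand-in for an older release whose constructor
# does not know about include_numbers.
class OldStyleCloud:
    def __init__(self, background_color="black", max_words=200):
        self.background_color = background_color
        self.max_words = max_words


options = {"background_color": "white", "max_words": 200, "include_numbers": True}
supported = filter_supported_kwargs(OldStyleCloud, options)
dropped = sorted(set(options) - set(supported))

print("supported:", supported)
print("dropped:", dropped)
```

With the real package, passing WordCloud in place of the stand-in, for example WordCloud(**filter_supported_kwargs(WordCloud, options)), should let the same script run on installs with and without include_numbers. If you need the parameter itself, pip can install directly from the repository (pip install git+https://github.com/amueller/word_cloud.git), though building from source may require a C compiler.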

I submitted this as an issue on GitHub here: https://github.com/amueller/word_cloud/issues/482 if you want to follow along and see what the developers say.
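
In the meantime, a possible workaround is to count the tokens yourself, keeping digits, and build the cloud from those counts. This is only a sketch under assumptions: count_tokens is a hypothetical helper, the regular expression is a simple tokenizer of my own choosing rather than the one wordcloud uses, and the sample strings are made up for illustration.

```python
import re
from collections import Counter


def count_tokens(texts, stopwords=frozenset()):
    """Count lowercase word and number tokens across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Letters, digits, and apostrophes form a token; numbers are kept.
        for token in re.findall(r"[A-Za-z0-9']+", str(text).lower()):
            if token not in stopwords:
                counts[token] += 1
    return counts


# Made-up sample rows standing in for data['scored_copy'].
sample = ["Save 50 today", "50 percent off today only"]
frequencies = count_tokens(sample, stopwords={"only"})

print(frequencies)
```

The resulting mapping can then be passed to WordCloud(...).generate_from_frequencies(frequencies), which takes word counts directly and so skips the text-processing step. Note that this simple counter does not reproduce wordcloud's collocation or plural handling.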