R wordcloud2 puts the most frequent words on the edges instead of in the center


I would like to use the wordcloud2 function in R on my dataset. The demo works nicely: [demo image from the library docs]

But with my dataset, the small text ends up in the middle. Any ideas or suggestions (including other R libraries) are welcome :)

my dataset

Thank you very much, Vladimir Vinarsky, The Curious Mechanobiologist

I expected my highest-frequency words in the middle, not at the edges. I tried to:

  1. manipulate the frequency distribution by squaring or cubing it
  2. use different shapes

There is 1 answer

Allan Cameron (best answer)

Words at the top of your data frame are plotted centrally; those at the bottom are plotted peripherally, so if you want the big words in the middle, sort your data frame accordingly.
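This is also why squaring or cubing the frequencies does not help: for non-negative counts, a power transform changes the relative sizes but leaves the row order untouched. A quick check in base R, using a made-up three-word data frame (the `toy` names are purely illustrative):

```r
# Hypothetical toy data, deliberately sorted from rarest to most frequent
toy <- data.frame(word = c("rare", "common", "dominant"),
                  freq = c(2, 10, 50),
                  stringsAsFactors = FALSE)

# Cubing the frequencies changes the relative sizes ...
toy$freq_cubed <- toy$freq^3

# ... but the row order is exactly the same as before
same_order <- identical(order(toy$freq), order(toy$freq_cubed))

# Sorting in decreasing order moves the most frequent word to the first row
toy_sorted <- toy[order(toy$freq, decreasing = TRUE), ]
```

Only the reordering in the last step changes which word comes first in the data frame.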

For example, let's generate a data frame of computing terms with a random frequency column:

my_data <- data.frame(word = c("Algorithm", "Function", "Variable",
  "Loop", "Object", "Class", "Inheritance", "Interface", "Array", 
  "String", "Integer", "Boolean", "Compiler", "Debugger", "Syntax",
  "Exception", "Library", "Framework", "API", "Database", "Query", 
  "Server", "Client", "Protocol", "Encryption", "Binary", "Source",
  "IDE", "Repository", "Recursion", "Data", "Pointer", "Stack", "Queue", 
  "Tree", "Graph", "Hash", "Encryption", "Bit", "Byte", "Bandwidth", 
  "Cache", "Cloud", "Compiler", "Constant", "Debug", "Deployment",
  "DNS", "Domain", "Email", "Firewall", "Gateway", "Git", "Hardware",
  "HTTP", "HTTPS", "IP Address", "JSON", "Kernel", "LAN", "Metadata",
  "Multithreading", "Network", "Node", "Packet", "Patch", "Pixel",
  "Platform", "Plugin"))

set.seed(1)
my_data$freq <- round(rnorm(nrow(my_data), 10, 3)^2)

If I plot without ordering at all, I get a fairly random distribution of sizes throughout the cloud:

wordcloud2::wordcloud2(my_data, size = 0.3)


If I sort from small to large, I can replicate your issue with the smaller words appearing in the middle of the image:

wordcloud2::wordcloud2(my_data[order(my_data$freq),], size = 0.3)


If I sort in reverse order, I get the desired output, with the larger words drawn in the middle before smaller words are added around and between the larger words:

wordcloud2::wordcloud2(my_data[rev(order(my_data$freq)),], size = 0.3)

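If you build clouds from several data frames, you could wrap the sorting step in a small helper so it is never forgotten. The sketch below is only one way to do it: `sort_for_wordcloud` and its `word_col`/`freq_col` arguments are hypothetical names, not part of wordcloud2, and it assumes (as far as I know) that wordcloud2 reads the first two columns of the data frame as word and frequency.

```r
# Hypothetical helper (not part of wordcloud2): return a two-column data
# frame with the most frequent words in the first rows
sort_for_wordcloud <- function(data, word_col = "word", freq_col = "freq") {
  stopifnot(is.data.frame(data),
            word_col %in% names(data),
            freq_col %in% names(data))
  # Keep only word and frequency, in that order (assumed column layout)
  out <- data[, c(word_col, freq_col)]
  # Largest frequency first, so the big words come first in the data frame
  out <- out[order(out[[freq_col]], decreasing = TRUE), ]
  rownames(out) <- NULL
  out
}

# Made-up example data
example_words <- data.frame(word = c("Loop", "Array", "Kernel", "Cache"),
                            freq = c(12, 85, 40, 3),
                            stringsAsFactors = FALSE)
sorted_words <- sort_for_wordcloud(example_words)

# Build the cloud only if the package is installed; print `cloud` to view it
if (requireNamespace("wordcloud2", quietly = TRUE)) {
  cloud <- wordcloud2::wordcloud2(sorted_words, size = 0.3)
}
```

Using `order(..., decreasing = TRUE)` gives the same result as `rev(order(...))` apart from how ties are ordered.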