I'm doing some text mining with the "tm" package in R, and I can get word frequencies after generating a document-term matrix:
freq <- colSums(as.matrix(dtm))
ord <- order(freq)
freq[head(ord)]
# abit acal access accord across acsess
# 1 1 1 1 1 1
freq[tail(ord)]
# direct save month will thank list
# 106 107 116 122 132 154
This only gives me a list of word frequencies in sorted order. Is there a way to check the frequency of an individual word? Can I also check the frequency of a phrase? For example, how many times does the word "thank" occur in the corpus, or how often does the phrase "contact number" appear in it?
Many thanks for any hints and suggestions.
Since freq is a named vector, a single word can be looked up by its name. I'll show this with the data that ships with the tm package.
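A minimal sketch of the single-word lookup, assuming the `crude` sample corpus bundled with tm and the default `DocumentTermMatrix` settings. The helper `term_count` and the example term "oil" are my own illustration, not part of tm or of the question's data:

```r
library(tm)

# `crude` is a small sample corpus of Reuters articles bundled with tm.
data("crude")

dtm <- DocumentTermMatrix(crude)
m <- as.matrix(dtm)          # rows = documents, columns = terms
freq <- colSums(m)           # named vector: one total per term

# Hypothetical helper: return 0 instead of an error/NA for unseen terms.
term_count <- function(freq, term) {
  if (term %in% names(freq)) freq[[term]] else 0
}

term_count(freq, "oil")      # total occurrences across the corpus
m[, "oil"]                   # occurrences per document
```

With your own matrix the same lookup would be `term_count(freq, "thank")`. Keep in mind that the term has to match the column name exactly, so any preprocessing you applied (lower-casing, stemming, and so on) also applies to the word you look up.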
If you want to do this with phrases, your dtm must contain those phrases as terms, not just the bag of single words that is used in most cases. If that data is available, the procedure is the same as for a single word.
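One way to get phrases into the dtm is a custom tokenizer that emits word pairs. The following is only a sketch under a few assumptions: it again uses the bundled `crude` corpus, it builds bigrams with `ngrams()` and `words()` from the NLP package (which tm depends on), and the names `bigram_tokenizer`, `phrase_dtm`, and `phrase_freq` are made up for the example. As far as I know, a custom tokenizer is only honoured for a `VCorpus` (which `crude` is), not for a `SimpleCorpus`.

```r
library(tm)
data("crude")

# Tokenizer that emits two-word phrases (bigrams) instead of single words.
bigram_tokenizer <- function(x) {
  unlist(lapply(NLP::ngrams(NLP::words(x), 2L), paste, collapse = " "),
         use.names = FALSE)
}

# Document-term matrix whose columns are bigrams.
phrase_dtm <- DocumentTermMatrix(crude,
                                 control = list(tokenize = bigram_tokenizer))
phrase_freq <- colSums(as.matrix(phrase_dtm))

# Same lookup as for a single word; 0 if the phrase never occurs.
phrase <- "crude oil"
if (phrase %in% names(phrase_freq)) phrase_freq[[phrase]] else 0
```

For the question's example you would build the matrix from your own corpus and look up "contact number" instead. Longer phrases work the same way with a larger n in `ngrams()`.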