Consider the usual example, which replicates the one from Section 13.1 of An Introduction to Information Retrieval:
https://nlp.stanford.edu/IR-book/pdf/irbookonlinereading.pdf
library(quanteda)
library(quanteda.textmodels)

txt <- c(d1 = "Chinese Beijing Chinese",
         d2 = "Chinese Chinese Shanghai",
         d3 = "Chinese Macao",
         d4 = "Tokyo Japan Chinese",
         d5 = "Chinese Chinese Chinese Tokyo Japan")
trainingset <- dfm(tokens(txt), tolower = FALSE)
trainingclass <- factor(c("Y", "Y", "Y", "N", NA), ordered = TRUE)
tmod1 <- textmodel_nb(trainingset, y = trainingclass, prior = "docfreq")
According to the docs, PcGw is the posterior class probability given the word. How is it computed? I thought what we cared about was the other way around, that is, P(word | class).
> tmod1$PcGw
features
classes Chinese Beijing Shanghai Macao Tokyo Japan
N 0.1473684 0.2058824 0.2058824 0.2058824 0.5090909 0.5090909
Y 0.8526316 0.7941176 0.7941176 0.7941176 0.4909091 0.4909091
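For what it's worth, I can reproduce those numbers by applying Bayes' rule to P(word | class) by hand. This is only a sketch of my understanding (assuming add-one smoothing and the "docfreq" prior, i.e. P(N) = 1/4 and P(Y) = 3/4 from the four labelled documents), not necessarily what textmodel_nb does internally:

# keep only the labelled documents (d5 has class NA)
counts <- as.matrix(trainingset[1:4, ])
class_counts <- rowsum(counts, group = trainingclass[1:4])

# P(word | class) with add-one (Laplace) smoothing
PwGc <- (class_counts + 1) / rowSums(class_counts + 1)

# "docfreq" prior: class proportions among the labelled documents
prior <- prop.table(table(trainingclass[1:4]))

# Bayes' rule: P(c | w) = P(c) * P(w | c) / sum over c' of P(c') * P(w | c')
joint <- PwGc * as.numeric(prior)   # rows recycle in class order (N, Y)
PcGw  <- sweep(joint, 2, colSums(joint), "/")
PcGw   # matches tmod1$PcGw above

So PcGw seems to just be the column-normalised product of the prior and PwGc, but I would like confirmation that this is what is going on.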
Thanks!