Can I manually reorder an LDA_Gibbs topic model?


I have an LDA_Gibbs topic model from the topicmodels library. I also have an LDAvis interactive visualisation.

My issue is that the topics are not in the same order in the LDA object and in LDAvis.

I'd like to get one to map to the other (I don't care which). My approach so far, which does not work:

ldavis_data <- fromJSON(json_lda)
topic_order <- ldavis_data$topic.order
lda@gamma[order(topic_order), ]
lda@beta[, order(topic_order)]

This was inspired by this GitHub issue, which concerns a different topic model package.

However, it completely clobbers my LDA object.

No reprex/MWE yet (I could link a .rds file), but here is the relevant output of glimpse(lda):

<snip>
..@ beta  :num [1:45, 1:333...]
..@ gamma :num [1:111..., 1:45]
</snip>

For now, I'm manually mapping the LDAvis topics to their counterparts in the LDA() object.

---- EDIT ----

I've found a reasonable stopgap, almost: My further analysis relies on the tidy.LDA function from tidytext, so I can add the correct ordering of the topic-term mapping like so:


library(dplyr)     # arrange(), mutate(), group_by(), cur_group_id(), %>%
library(tidytext)  # tidy() method for LDA objects

# terms to topics
tidy(lda, matrix = "beta") %>%
  # probably unnecessary, but make sure we're in topic order
  arrange(topic) %>%
  # turn topics into a factor, with levels according to new order
  mutate(topic = factor(topic, levels = topic_order)) %>%
  # group by new factor order
  group_by(topic) %>%
  # make the current group id the current topic
  mutate(topic = cur_group_id()) %>%
  # don't forget! had me scratching my head for a few minutes
  ungroup()

# documents to topics
tidy(lda, matrix = "gamma") %>%
  arrange(topic) %>%
  mutate(topic = factor(topic, levels = topic_order)) %>%
  group_by(topic) %>%
  mutate(topic = cur_group_id()) %>%
  ungroup()

Yup, works for document mapping too. Now to collapse them to a function ;)


There are 2 answers

Martin Andersen

Reposting my edit as an answer, though I'm not yet inclined to accept it.

I get the results I want, sure, but not in the way I wanted.


I've found a reasonable stopgap, almost: My further analysis relies on the tidy.LDA() function from tidytext, so I can add the correct ordering of the topic-term mapping like so:


library(dplyr)     # arrange(), mutate(), group_by(), cur_group_id(), %>%
library(tidytext)  # tidy() method for LDA objects

# terms to topics
tidy(lda, matrix = "beta") %>%
  # probably unnecessary, but make sure we're in topic order
  arrange(topic) %>%
  # turn topics into a factor, with levels according to new order
  mutate(topic = factor(topic, levels = topic_order)) %>%
  # group by new factor order
  group_by(topic) %>%
  # make the current group id the current topic
  mutate(topic = cur_group_id()) %>%
  # don't forget! had me scratching my head for a few minutes
  ungroup()

# documents to topics
tidy(lda, matrix = "gamma") %>%
  arrange(topic) %>%
  mutate(topic = factor(topic, levels = topic_order)) %>%
  group_by(topic) %>%
  mutate(topic = cur_group_id()) %>%
  ungroup()

Yup, works for document mapping too. Now to collapse them to a function ;)
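One way the two pipelines might collapse into a function, as a sketch rather than a tested drop-in. It assumes that position i of topic_order holds the original topic number that LDAvis displays as topic i; under that assumption, match() gives the same relabelling as the factor/cur_group_id() steps. The name relabel_topics and the toy data frame are made up for illustration.

```r
# Hypothetical helper: relabel the `topic` column of a tidy data frame
# (e.g. the output of tidy(lda, matrix = "beta") or matrix = "gamma")
relabel_topics <- function(tidy_df, topic_order) {
  # For each original topic id, match() returns its position in topic_order,
  # i.e. the topic number shown in LDAvis (under the assumption stated above)
  tidy_df$topic <- match(tidy_df$topic, topic_order)
  tidy_df
}

# Toy data standing in for tidy(lda, matrix = "beta"); values are invented
toy <- data.frame(
  topic = c(1L, 2L, 3L, 1L, 2L, 3L),
  term  = rep(c("a", "b"), each = 3),
  beta  = c(0.1, 0.2, 0.3, 0.9, 0.8, 0.7)
)

# LDAvis topic 1 is original topic 3, LDAvis topic 2 is original topic 1, ...
topic_order <- c(3L, 1L, 2L)

relabelled <- relabel_topics(toy, topic_order)
```

With a real model the call would be something like relabel_topics(tidy(lda, matrix = "beta"), topic_order), and the same again for "gamma".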

ThomasK81

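I think order() is the problem, and I also think you reordered rows where you needed to reorder columns, and vice versa. In topicmodels, gamma is documents x topics and beta is topics x terms, so the topic dimension is the columns of gamma and the rows of beta. Assuming you have created a working LDAvis from your topicmodels model, this should get the two in sync: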

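library(jsonlite)  # assumption: fromJSON() comes from jsonlite
ldavis_data <- fromJSON(json_lda)
topic_order <- ldavis_data$topic.order
# gamma is documents x topics: reorder its columns
lda@gamma[, topic_order]
# beta is topics x terms: reorder its rows
lda@beta[topic_order, ]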
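To illustrate why order() gets in the way, here is a toy example with invented numbers (no real model involved). It assumes topic_order lists the original topic numbers in the order LDAvis shows them. Indexing by topic_order directly gives the LDAvis order, whereas order(topic_order) is the inverse permutation, which maps the LDAvis order back to the original one.

```r
# Toy stand-in for lda@gamma: 2 documents x 3 topics (made-up values)
gamma <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("t1", "t2", "t3"))
)

# Assumed meaning: LDAvis topic 1 is original topic 3, and so on
topic_order <- c(3L, 1L, 2L)

# Indexing the topic dimension by topic_order gives the LDAvis order
gamma_vis <- gamma[, topic_order]
vis_names <- colnames(gamma_vis)

# order(topic_order) is the inverse permutation: it undoes the reordering
inverse <- order(topic_order)
gamma_back <- gamma_vis[, inverse]
```

So applying order(topic_order) to the original matrix produces a different arrangement from the one LDAvis uses, unless the permutation happens to be its own inverse.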

Also, if you want the phi and theta data that is displayed in LDAvis from a model generated with the topicmodels package, you can do the following:

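# posterior() comes from topicmodels
lda_posterior <- posterior(lda)
# theta: documents x topics, columns put in LDAvis order
lda_theta <- lda_posterior$topics[, topic_order]
# phi: topics x terms, rows put in LDAvis order
lda_phi <- lda_posterior$terms[topic_order, ]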
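One detail that may be worth checking if you compare lda@beta with phi: as far as I know, the topicmodels documentation describes the beta slot as logarithmized word probabilities, while posterior(lda)$terms is on the probability scale. A toy sketch with invented numbers, assuming that is the case:

```r
# Toy stand-in for lda@beta: 2 topics x 3 terms, stored as log probabilities
log_beta <- log(matrix(
  c(0.5, 0.3, 0.2,
    0.1, 0.1, 0.8),
  nrow = 2, byrow = TRUE
))

# Assumed meaning: LDAvis topic 1 is original topic 2, and so on
topic_order <- c(2L, 1L)

# Back-transform to probabilities, then put the rows in LDAvis order
phi <- exp(log_beta)[topic_order, ]

# Reordering rows should not change the fact that each topic sums to 1
phi_row_sums <- rowSums(phi)
```

If the row sums of your reordered phi are not (numerically) 1, the matrix is probably still on the log scale or has been indexed along the wrong dimension.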