How to generate a sample using n-order Markov Chains with R?


I'm attempting to generate a sample from an n-order transition matrix using Markov Chains in R. I've successfully constructed this n-order transition matrix using the following code:

set.seed(1)

dat <- sample(c("A", "B", "C"), size = 2000, replace = TRUE) # Data

n <- 2 # Order of the transition matrix
if (n > 1) {
  from <- head(apply(embed(dat, n)[, n:1], 1, paste, collapse = ""), -1)
  to <- dat[-1:-n]
} else {
  from <- dat[-length(dat)]
  to <- dat[-1]
}

fromTo <- data.frame(cbind(from, to))
TM <- table(fromTo)
TM <- TM / rowSums(TM) # Transition matrix

However, I'm having trouble writing code that generates a sample from this transition matrix and adapts to different values of n. Is there a way to do it?

Ideally, I'd prefer a solution that doesn't involve the 'markovchain' package due to compatibility issues across different R versions.

There is 1 answer

ThomasIsCoding (best answer)

Update

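If you just want to draw a sample from a given transition matrix, you can try the code below, which builds on the MarkovChain function defined in the Previous section further down. Here preStat is the preceding state, i.e. a row name of the transition matrix, so it contains ord - 1 symbols.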

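MarkovChainSampling <- function(dat, ord, preStat) {
  # Transition matrix estimated from windows of length ord
  TM <- MarkovChain(dat, ord)
  # Draw one state, using the row for preStat as the probabilities
  sample(colnames(TM), 1, prob = TM[preStat, ])
}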

such that

> MarkovChainSampling(dat, 2, "A")
[1] "C"

> MarkovChainSampling(dat, 3, "AB")
[1] "A"

> MarkovChainSampling(dat, 4, "AAA")
[1] "C"
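
To produce a whole simulated sequence rather than a single next state, one option is to estimate the transition table once and then repeatedly sample the next symbol from the row of the current state. The sketch below is one way to do that in base R. build_tm and simulate_chain are hypothetical helper names (not part of the answer), n is the number of past symbols in each state, and the code assumes every state reached during simulation was observed in dat.

```r
# Estimate an order-n transition table (rows: last n symbols, columns: next symbol).
build_tm <- function(dat, n) {
  # embed() rows run newest -> oldest, so reverse the columns into time order.
  m <- embed(dat, n + 1)[, (n + 1):1, drop = FALSE]
  from <- apply(m[, 1:n, drop = FALSE], 1, paste, collapse = "")
  to <- m[, n + 1]
  prop.table(table(from, to), 1) # row-wise proportions
}

# Simulate len symbols, starting from the n symbols given in init.
simulate_chain <- function(tm, init, len) {
  n <- length(init)
  stopifnot(len > n)
  out <- character(len)
  out[seq_len(n)] <- init
  for (i in (n + 1):len) {
    # Current state = the previous n symbols pasted together, as in build_tm()
    state <- paste(out[(i - n):(i - 1)], collapse = "")
    out[i] <- sample(colnames(tm), 1, prob = tm[state, ])
  }
  out
}

set.seed(1)
dat <- sample(c("A", "B", "C"), size = 2000, replace = TRUE)
tm <- build_tm(dat, 2)
sim <- simulate_chain(tm, init = c("A", "B"), len = 20)
```

Changing n in build_tm and the length of init together is what adapts the simulation to a different order. Computing the table once also avoids re-estimating it for every draw.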

Previous

I think you are after the transition matrix of a Markov chain of order n. Below is one option that might give you some clues.

You can use embed as shown below:

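MarkovChain <- function(dat, ord) {
  # One row per sliding window of length ord
  d <- as.data.frame(embed(dat, ord))
  df <- data.frame(
    pre = do.call(paste, c(d[-ord], sep = "")), # first ord - 1 columns, pasted
    cur = d[[ord]]                              # last column
  )
  proportions(table(df), 1) # row-wise proportions
}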

and you will obtain

> MarkovChain(dat, 2)
   cur
pre         A         B         C
  A 0.3377386 0.3509545 0.3113069
  B 0.3333333 0.3348281 0.3318386
  C 0.3513097 0.3174114 0.3312789

> MarkovChain(dat, 3)
    cur
pre          A         B         C
  AA 0.3347826 0.3826087 0.2826087
  AB 0.3430962 0.3263598 0.3305439
  AC 0.3396226 0.3160377 0.3443396
  BA 0.3273543 0.2959641 0.3766816
  BB 0.3392857 0.3482143 0.3125000
  BC 0.3783784 0.3063063 0.3153153
  CA 0.3524229 0.3700441 0.2775330
  CB 0.3155340 0.3300971 0.3543689
  CC 0.3348837 0.3302326 0.3348837
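
One detail worth checking in the MarkovChain function above: according to the documentation of embed, each row holds x[t], x[t-1], ..., x[t-dimension+1], so the first column is the newest value and the last column is the oldest. If that reading is right, pre is built from the later observations and cur is the earliest one, which corresponds to the chain in reverse time. For independent draws like dat this barely changes the numbers, but for genuinely dependent data it can. A minimal sketch that makes the ordering visible and reverses the columns (x, w and w_fwd are illustrative names):

```r
x <- c("a", "b", "c", "d", "e")

w <- embed(x, 3)   # each row: x[t], x[t-1], x[t-2]
w[1, ]             # "c" "b" "a"  (newest first)

w_fwd <- w[, 3:1]  # reverse the columns to read oldest -> newest
w_fwd[1, ]         # "a" "b" "c"
```

With the columns reversed, the first ord - 1 columns form the preceding state and the last column is the symbol that follows it.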