Implementing Algorithm X in R

Asked by At

I am looking to implement something sort of like Knuth's Algorithm X in R.

The problem: I have a n x k matrix A, n>=k, with real-valued entries representing a cost. Both n and k are going to be pretty small in general (n<10, k<5). I want to find the mapping of rows onto columns that minimizes the total cost of the matrix, subject to the constraint that no single row can be used twice.

I think this is sort of like Algorithm X in that a reasonable approach seems to be:

  1. Pick a column in A and find the minimum value in it.
  2. Remove that row and that column. Now you're left with Asub.
  3. Go to Step 1 and repeat with Asub, and a new column selection, until ncol(Asub)=1.

But I can't figure out how to create a recursive data structure in R that will store the resulting tree of cell-level costs. Here's what I have so far, which only makes it down one branch, and so doesn't find the optimal solution.

# This version of the algorithm always selects the first column. We need to make it 
# traverse all branches.
algorithmX <- function(A) {
  for (c in 1:ncol(A)) {
    r <- which.min(A[,c])
    memory <- data.frame(LP_Number = colnames(A)[c], 
                         Visit_Number = rownames(A)[r], 
                         cost = as.numeric(A[r,c]))
    if (length(colnames(A))>1) {
      Ared <- A[-r, -c, drop=FALSE]
      return( rbind(memory, algorithmX(Ared)) )
    }
    else {
      return(memory)
    }
  }
}

foo <- c(8.95,3.81,1.42,1.86,4.32,7.16,12.86,7.59,5.47,2.12,
         0.52,3.19,13.97,8.79,6.52,3.37,0.91,2.03)
colnames(foo) <- paste0("col",c(1:3))
rownames(foo) <- paste0("row",c(1:6))
algorithmX(foo)

I'm sure I'm missing something basic in how to handle recursion in an R function. I'm also happy to hear other ways of solving this problem if this algorithm isn't actually the best fit.

2 Answers

1
ErinMcJ On Best Solutions

Thanks to user2554330 above for some pointers on how to structure a recursive function so that values are retained. I modified their code as follows, and now it appears to work, catching all the corner cases I had identified before that necessitated me writing this function in the first place!

algorithmX <- function(A) {
  best.match <- data.frame(LP_Number=numeric(), Visit_Number=numeric(), cost=numeric(), total.cost=numeric())
  for (c in 1:ncol(A)) {
    r <- which.min(A[,c])
    memory <- data.frame(LP_Number = colnames(A)[c], 
                         Visit_Number = rownames(A)[r], 
                         cost = as.numeric(A[r,c]),
                         total.cost = as.numeric(NA))
    if (length(colnames(A))>1) {
      Ared <- A[-r, -c, drop=FALSE]
      memory <- rbind(memory, algorithmX(Ared))
    }
    total.cost <- summarize(memory, sum(cost)) %>% unlist() %>% as.numeric()
    memory$total.cost <- total.cost
    if (length(best.match$total.cost)==0 | memory$total.cost[1] < best.match$total.cost[1]) {
      best.match <- memory
    }
  }
  return(best.match)
}
1
user2554330 On

You've missed setting up foo as a matrix, so you can't set colnames(foo) or rownames(foo). Assuming that's just a typo, there's also the issue that you never visit anything other than c = 1, because both branches of the inner test return something. You probably want to collect the results in the loop, pick the best one, and return that.

For example,

algorithmX <- function(A) {
  bestcost <- Inf
  save <- NULL
  for (c in 1:ncol(A)) {
    r <- which.min(A[,c])
    memory <- data.frame(LP_Number = colnames(A)[c], 
                         Visit_Number = rownames(A)[r], 
                         cost = as.numeric(A[r,c]))
    if (length(colnames(A))>1) {
      Ared <- A[-r, -c, drop=FALSE]
      memory <- rbind(memory, algorithmX(Ared)) 
    }
    if (sum(memory$cost) < bestcost) {
      bestcost <- sum(memory$cost)
      save <- memory
    }
  }
  return(save)
}