How to get lists of descendant tip names of a data.tree object recursively?

I'm using the data.tree package in R and I have a tree like the one below, where only the leaf nodes have labels. I am trying to assign attributes that are lists of the descendants (we can assume the tree is always bifurcating).

library(ape)        # provides read.tree()
library(data.tree)  # provides as.Node()

tree <- read.tree(text = "((A,B),C);")
tree <- as.Node(tree)
print(tree)
  levelName
1 4        
2  ¦--5    
3  ¦   ¦--A
4  ¦   °--B
5  °--C    

What I'm looking for is something like the output below, except that for node 4 desc1 should be (A,B) rather than 5.

tree$Do(function(node) node$desc1 <- as.vector(sapply(node$children, function(x) x$name))[1],
         traversal = "post-order", filterFun = isNotLeaf)
tree$Do(function(node) node$desc2 <- as.vector(sapply(node$children, function(x) x$name))[2],
         traversal = "post-order", filterFun = isNotLeaf)
print(tree, "desc1", "desc2")
  levelName desc1 desc2
1 4             5     C
2  ¦--5         A     B
3  ¦   ¦--A            
4  ¦   °--B            
5  °--C   

So I don't just need the child node names; I need to recursively collect the descendant tip names as lists and assign them while moving from the tips to the root. The desired output is:

  levelName desc1 desc2
1 4           (A,B)   C
2  ¦--5         A     B
3  ¦   ¦--A            
4  ¦   °--B            
5  °--C  

Big picture: I will eventually need to traverse the tree and pass these lists into another function, e.g. doSomething( group1 = "C", group2 = c("A","B") ), and save the output of that function as an attribute of the internal nodes.

The closest I've gotten is this recursive function, but it gives the same output as above:

getDescendant <- function(node) {
  if(node$isLeaf == FALSE) {
    result <- as.vector(sapply(node$children, function(x) x$name))
  } else {
    result <- sapply(node$children, getDescendant)
  }
  return(result)
}
print(tree, desc = getDescendant)
  levelName desc
1 4         5, C
2  ¦--5     A, B
3  ¦   ¦--A     
4  ¦   °--B     
5  °--C   
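
A possible explanation for the output above: the recursive call sits in the branch that runs only for leaves, which have no children, so the function never descends past the immediate children. Below is a minimal sketch of a version that recurses on internal nodes instead. It uses only `isLeaf`, `children`, and `name`, as in the code above; `get_tips` is a hypothetical name, and the `library()` calls assume `read.tree` comes from the ape package.

```r
library(ape)
library(data.tree)

tree <- as.Node(read.tree(text = "((A,B),C);"))

# Return the names of all tips below `node`.
# Base case: a leaf contributes its own name.
# Recursive case: concatenate the tips found under every child.
get_tips <- function(node) {
  if (node$isLeaf) {
    return(node$name)
  }
  unlist(lapply(node$children, get_tips), use.names = FALSE)
}

get_tips(tree)                # tips under the root (node 4)
get_tips(tree$children[[1]])  # tips under the first child of the root (node 5)
```

Because the function recurses by itself, it does not depend on the traversal order and makes no assumption about how many children a node has.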

EDIT: Here is what I got to work, but it's not generalizable to trees that aren't bifurcating.

tree <- read.tree(text = "((A,B),C);")
tree <- as.Node(tree)

########## Collect descendant clades for each node
get_descendants <- function(node) {
  if(node$children[[1]]$isLeaf == TRUE && node$children[[2]]$isLeaf == TRUE) {
    node$desc1 <- node$children[[1]]$name
    node$desc2 <- node$children[[2]]$name
  } else if(node$children[[1]]$isLeaf == FALSE && node$children[[2]]$isLeaf == FALSE) {
    node$desc1 <- c(as.vector(node$children[[1]]$desc1), as.vector(node$children[[1]]$desc2))
    node$desc2 <- c(as.vector(node$children[[2]]$desc1), as.vector(node$children[[2]]$desc2))
  } else {
    if(node$children[[1]]$isLeaf == TRUE && node$children[[2]]$isLeaf == FALSE) {
      node$desc1 <- node$children[[1]]$name
      node$desc2 <- c(as.vector(node$children[[2]]$desc1), as.vector(node$children[[2]]$desc2))
    } else {
      node$desc1 <- c(as.vector(node$children[[1]]$desc1), as.vector(node$children[[1]]$desc2))
      node$desc2 <- node$children[[2]]$name
    }
  }
}
tree$Do(function(node) get_descendants(node), filterFun=isNotLeaf, 
         traversal = "post-order")
print(tree, "desc1", "desc2")
  levelName desc1 desc2
1 4          A, B     C
2  ¦--5         A     B
3  ¦   ¦--A            
4  ¦   °--B            
5  °--C  
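
One way this might generalize to trees that are not bifurcating is to store a single list attribute with one character vector per child, instead of separate desc1 and desc2 fields. The sketch below is only an illustration under assumptions: the tree is built by hand with `Node$new` and `AddChild` to get a three-way split, the names `get_tips`, `tipGroups`, and `groupResult` are hypothetical, and `doSomething` is a placeholder that takes a list of groups and stands in for the real function.

```r
library(data.tree)

# Hypothetical example tree with a three-way split at the root.
root <- Node$new("root")
ab <- root$AddChild("AB")
ab$AddChild("A")
ab$AddChild("B")
root$AddChild("C")
root$AddChild("D")

# Names of all tips below a node; a leaf is its own single tip.
get_tips <- function(node) {
  if (node$isLeaf) return(node$name)
  unlist(lapply(node$children, get_tips), use.names = FALSE)
}

# Placeholder for the real analysis (assumption): here it just
# reports the size of each group.
doSomething <- function(groups) lengths(groups)

# For every internal node, store one tip vector per child and the
# result of doSomething() on those groups.
root$Do(function(node) {
  node$tipGroups <- unname(lapply(node$children, get_tips))
  node$groupResult <- doSomething(node$tipGroups)
}, filterFun = isNotLeaf)

root$tipGroups  # one character vector per child of the root
ab$tipGroups    # one character vector per child of node "AB"
```

For a strictly bifurcating tree, the two groups are then `node$tipGroups[[1]]` and `node$tipGroups[[2]]`, which could be passed as group1 and group2.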