Can I manually create an RWeka decision (Recursive Partitioning) tree?


I have constructed a J48 decision tree using RWeka. I would like to compare its performance to an existing (externally computed) decision tree. I'm new to RWeka and I'm having trouble manually creating an RWeka decision tree. Ideally, I would like to show the two side by side and plot both using the RWeka visualization (it is very informative and clean).

Right now, my plan is to export the RWeka-computed decision tree to Graphviz and manipulate it into the structure I want. Before I start, I want to make sure I can't simply specify the rules myself and build the decision tree from them.

I don't want to compute the decision tree (I've done that); I want to manually construct/specify a decision tree (for uniform comparison in my presentation).

Thank you in advance.

2

There are 2 answers

3
Achim Zeileis (best answer)

The RWeka package itself cannot do that. However, RWeka uses the partykit package for displaying its trees, and partykit can do what you want. See vignette("partykit", package = "partykit") for how to construct a recursive partynode object with pre-specified partysplits and then turn it into a constparty. The vignette has a hands-on example for this.

0
user3030872

Here is some example code for the package partykit that @Achim Zeileis suggested.

library(partykit)

Load the data:

data("WeatherPlay", package = "partykit")
WeatherPlay
#     outlook temperature humidity windy play
# 1     sunny          85       85 false   no
# 2     sunny          80       90  true   no
# 3  overcast          83       86 false  yes
# 4     rainy          70       96 false  yes
# 5     rainy          68       80 false  yes
# 6     rainy          65       70  true   no
# 7  overcast          64       65  true  yes
# ...

Initialize the decisions (splits): the integer (e.g. 1L) is the column index of the split variable in the data frame, which is only supplied later. index maps the levels of a factor to child nodes (categorical splits), and breaks gives the cutoff for a numeric variable (continuous splits).

sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)
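To check that a split routes observations as intended before building the tree, it can be applied to the data directly. The following is a minimal sketch, assuming the WeatherPlay data from above; kid_h and kid_o are illustrative names, not part of the original answer.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

sp_o <- partysplit(1L, index = 1:3)   # outlook: one child per factor level
sp_h <- partysplit(3L, breaks = 75)   # humidity: <= 75 vs. > 75

# kidids_split() gives, for every row, the position of the child node
# that the split sends the row to.
kid_h <- kidids_split(sp_h, data = WeatherPlay)
kid_o <- kidids_split(sp_o, data = WeatherPlay)

# Cross-tabulate the routing against the raw variables.
table(humidity_le_75 = WeatherPlay$humidity <= 75, child = kid_h)
table(outlook = WeatherPlay$outlook, child = kid_o)

# character_split() renders the split as readable labels.
character_split(sp_h, data = WeatherPlay)
```

With the default settings, a value equal to the break goes to the left child (the interval is closed on the right).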

Incorporate decisions into nodes:

pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))
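Before attaching a response, the node structure can be inspected on its own. This is a small sketch under the same assumptions as above (WeatherPlay data, the three splits as defined in this answer); py, leaf_ids and node_of_row are illustrative names.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)

pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))

# Pair the node structure with the data so that variable names and
# factor levels are available when printing.
py <- party(pn, WeatherPlay)
print(py)

# IDs of the terminal nodes, and the terminal node each row falls into.
leaf_ids <- nodeids(py, terminal = TRUE)
node_of_row <- fitted_node(pn, data = WeatherPlay)
table(node_of_row)
```

Printing the tree at this stage is a quick way to confirm that each hand-written rule ended up on the intended branch.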

Fit data to tree:

t2 <- party(pn,
  data = WeatherPlay,
  fitted = data.frame(
    "(fitted)" = fitted_node(pn, data = WeatherPlay),  # terminal node of each row
    "(response)" = WeatherPlay$play,                   # response variable
    check.names = FALSE),
  terms = terms(play ~ ., data = WeatherPlay))

t3 <- as.constparty(t2)
plot(t3)
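Since the goal in the question is a performance comparison, the hand-built tree can also be used for prediction. The sketch below is one possible way to do this, not part of the original answer: it rebuilds the tree, predicts on the same data and computes the accuracy. The names tree, pred and acc are illustrative. Note that a constparty predicts from the responses observed in each terminal node, not from the info labels.

```r
library(partykit)
data("WeatherPlay", package = "partykit")

sp_o <- partysplit(1L, index = 1:3)
sp_h <- partysplit(3L, breaks = 75)
sp_w <- partysplit(4L, index = 1:2)

pn <- partynode(1L, split = sp_o, kids = list(
  partynode(2L, split = sp_h, kids = list(
    partynode(3L, info = "yes"),
    partynode(4L, info = "no"))),
  partynode(5L, info = "yes"),
  partynode(6L, split = sp_w, kids = list(
    partynode(7L, info = "yes"),
    partynode(8L, info = "no")))))

tree <- as.constparty(party(pn,
  data = WeatherPlay,
  fitted = data.frame(
    "(fitted)" = fitted_node(pn, data = WeatherPlay),
    "(response)" = WeatherPlay$play,
    check.names = FALSE),
  terms = terms(play ~ ., data = WeatherPlay)))

# Predicted class for each row (majority class of its terminal node).
pred <- predict(tree, newdata = WeatherPlay, type = "response")

# Confusion matrix and accuracy on the data used to build the tree.
table(predicted = pred, observed = WeatherPlay$play)
acc <- mean(pred == WeatherPlay$play)
acc
```

The same confusion matrix and accuracy could be computed from the J48 model's predictions on identical data, which would give a like-for-like comparison of the two trees.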

source: http://cran.r-project.org/web/packages/partykit/vignettes/partykit.pdf
