Extracting contents from decision tree J48


I have the following decision tree, created with the RWeka package by the command J48(NSP ~ ., data = training):

[[1]]
J48 pruned tree
------------------

MSTV <= 0.4
|   MLTV <= 4.1: 3 (2.0)
|   MLTV > 4.1
|   |   ASTV <= 79
|   |   |   b <= 1383: 2 (18.0)
|   |   |   b > 1383
|   |   |   |   UC <= 5: 1 (2.0)
|   |   |   |   UC > 5: 2 (2.0)
|   |   ASTV > 79: 3 (2.0)
MSTV > 0.4
|   DP <= 0
|   |   ALTV <= 9: 1 (170.0/2.0)
|   |   ALTV > 9
|   |   |   FM <= 7
|   |   |   |   LBE <= 142: 1 (27.0/1.0)
|   |   |   |   LBE > 142
|   |   |   |   |   AC <= 2
|   |   |   |   |   |   e <= 1058: 1 (5.0)
|   |   |   |   |   |   e > 1058
|   |   |   |   |   |   |   DL <= 4: 2 (9.0/1.0)
|   |   |   |   |   |   |   DL > 4: 1 (2.0)
|   |   |   |   |   AC > 2: 1 (3.0)
|   |   |   FM > 7: 2 (2.0)
|   DP > 0
|   |   DP <= 1
|   |   |   UC <= 3: 2 (4.0/1.0)
|   |   |   UC > 3
|   |   |   |   MLTV <= 0.4: 3 (2.0)
|   |   |   |   MLTV > 0.4: 1 (8.0)
|   |   DP > 1: 3 (8.0)

Number of Leaves  : 16

Size of the tree : 31

I would like to extract the node values in two formats. The first format contains only the attribute names (MSTV, MLTV, DP, etc.), with each node nested inside its parent and '(' used as the separator between levels, for example:

(MSTV (MLTV...) (DP...) )

In the second format I would like to get the nodes with their values such as:

(MSTV 0.4 (MLTV 4.1 ....) (DP 0..... ) )

How can I extract this information? I think the node values could be separated with something like gsub("[A-Z]:", "", string), but the last lines (the leaf count and tree size summary) would need to be ignored. Thanks a lot for your help.
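One possible approach, sketched here under assumptions rather than as a tested solution for every J48 model: treat the printout as plain text, read the depth of each node from the number of "|" characters, and open a parenthesis on every "<=" line. The names j48_to_sexpr, tree_lines and with_values are made up for this illustration. The sketch assumes numeric binary splits only (each node prints as a "<=" line and a matching ">" line), and the lines are typed in by hand below; with a fitted model they could probably be obtained with something like capture.output(print(fit)).

```r
# Hypothetical helper: turn printed J48 lines into a nested "(A (B) (C))" string.
# Assumes numeric binary splits only, i.e. every node prints as a
# "name <= x" line followed later by a matching "name > x" line.
j48_to_sexpr <- function(tree_lines, with_values = FALSE) {
  # groups: 1 = "|   " indentation, 2 = attribute, 3 = operator, 4 = threshold
  pattern <- "^([| \t]*)([^ \t|]+)[ \t]*(<=|>)[ \t]*(-?[0-9.]+)"
  tokens <- character(0)
  open_depths <- integer(0)  # depths of nodes whose "(" is not yet closed

  for (line in tree_lines) {
    m <- regmatches(line, regexec(pattern, line))[[1]]
    if (length(m) == 0) next  # header, blank and summary lines do not match
    depth <- nchar(gsub("[^|]", "", m[2]))  # one "|" per tree level
    op <- m[4]
    # A "<=" line starts a new node at this depth; a ">" line only ends
    # the nodes nested inside the "<=" branch of the same node.
    limit <- if (op == "<=") depth else depth + 1
    n_close <- sum(open_depths >= limit)
    tokens <- c(tokens, rep(")", n_close))
    open_depths <- open_depths[open_depths < limit]
    if (op == "<=") {
      label <- if (with_values) paste(m[3], m[5]) else m[3]
      tokens <- c(tokens, paste0("(", label))
      open_depths <- c(open_depths, depth)
    }
  }
  tokens <- c(tokens, rep(")", length(open_depths)))
  gsub(" \\)", ")", paste(tokens, collapse = " "))
}

# The tree from the question, typed in as J48 normally prints it.
tree_lines <- c(
  "J48 pruned tree",
  "------------------",
  "",
  "MSTV <= 0.4",
  "|   MLTV <= 4.1: 3 (2.0)",
  "|   MLTV > 4.1",
  "|   |   ASTV <= 79",
  "|   |   |   b <= 1383: 2 (18.0)",
  "|   |   |   b > 1383",
  "|   |   |   |   UC <= 5: 1 (2.0)",
  "|   |   |   |   UC > 5: 2 (2.0)",
  "|   |   ASTV > 79: 3 (2.0)",
  "MSTV > 0.4",
  "|   DP <= 0",
  "|   |   ALTV <= 9: 1 (170.0/2.0)",
  "|   |   ALTV > 9",
  "|   |   |   FM <= 7",
  "|   |   |   |   LBE <= 142: 1 (27.0/1.0)",
  "|   |   |   |   LBE > 142",
  "|   |   |   |   |   AC <= 2",
  "|   |   |   |   |   |   e <= 1058: 1 (5.0)",
  "|   |   |   |   |   |   e > 1058",
  "|   |   |   |   |   |   |   DL <= 4: 2 (9.0/1.0)",
  "|   |   |   |   |   |   |   DL > 4: 1 (2.0)",
  "|   |   |   |   |   AC > 2: 1 (3.0)",
  "|   |   |   FM > 7: 2 (2.0)",
  "|   DP > 0",
  "|   |   DP <= 1",
  "|   |   |   UC <= 3: 2 (4.0/1.0)",
  "|   |   |   UC > 3",
  "|   |   |   |   MLTV <= 0.4: 3 (2.0)",
  "|   |   |   |   MLTV > 0.4: 1 (8.0)",
  "|   |   DP > 1: 3 (8.0)",
  "",
  "Number of Leaves  : 16",
  "",
  "Size of the tree : 31"
)

cat(j48_to_sexpr(tree_lines), "\n")
# (MSTV (MLTV (ASTV (b (UC)))) (DP (ALTV (FM (LBE (AC (e (DL)))))) (DP (UC (MLTV)))))
cat(j48_to_sexpr(tree_lines, with_values = TRUE), "\n")
# (MSTV 0.4 (MLTV 4.1 (ASTV 79 (b 1383 (UC 5)))) (DP 0 (ALTV 9 (FM 7 (LBE 142 (AC 2 (e 1058 (DL 4)))))) (DP 1 (UC 3 (MLTV 0.4)))))
```

Because only lines of the form "name <= number" or "name > number" match the pattern, the header and the "Number of Leaves" / "Size of the tree" lines are skipped without special handling. Nominal attributes, which J48 prints with "=", would need an extended pattern.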
