Extract link and node attributes from OCG cluster in R


I have a cluster object created with the getOCG.clusters() function (Overlapping Cluster Generator) from the {linkcomm} package in R.

Here is the object for reproduction:

graph <- structure(list(numbers = c(22L, 18L, 12L), modularity = 599L, 
    Q = 0.5341, nodeclusters = structure(list(node = c("isstilt=0", 
    "roo_main=6", "roo_main=2", "stsub2mainth=tsuma", "roo_main=3", 
    "st_adsb=add", "roo_main=7", "st_adsb=sub", "st_sub2_main_th=hira", 
    "st_sub_main_th=other", "st_adsb=add", "st_con_rt=main-room", 
    "st_th=tsuma", "st_adsb=sub", "stsub2mainth=tsuma", "st_con_rt=sub-room", 
    "st_sub_main_th=hira", "stsub2mainth=tsuma", "st_con_tr=direct", 
    "stsub2mainth=tsuma", "st_adsb=add", "st_con_tr=terrace", 
    "st_con_tr=direct", "st_sub_main_th=other", "st_con_rt=sub-room", 
    "st_sub_main_th=tsuma", "st_con_rt=main-room", "st_th=hira"),
    cluster = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 
    5L, 5L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 11L, 
    11L, 12L, 12L)), class = "data.frame", row.names = c(NA, 
    -28L)), numclusters = c(`stsub2mainth=tsuma` = 4, `st_adsb=add` = 3, 
    `st_adsb=sub` = 2, `st_con_rt=main-room` = 2, `st_con_rt=sub-room` = 2, 
    `st_con_tr=direct` = 2, `st_sub_main_th=other` = 2, `roo_main=6` = 1, 
    `st_sub2_main_th=hira` = 1, `roo_main=3` = 1, `st_sub_main_th=tsuma` = 1, 
    `st_th=hira` = 1, `st_th=tsuma` = 1, `st_sub_main_th=hira` = 1, 
    `st_con_tr=terrace` = 1, `isstilt=0` = 1, `roo_main=7` = 1, 
    `roo_main=2` = 1), igraph = structure(list(18, FALSE, c(1, 
    3, 4, 5, 6, 7, 8, 9, 7, 7, 10, 12, 13, 15, 16, 16, 8, 12, 
    12, 15, 12, 17), c(0, 2, 2, 2, 2, 4, 7, 7, 5, 6, 4, 11, 6, 
    14, 15, 12, 6, 8, 5, 12, 6, 12), NULL, NULL, NULL, NULL, 
        list(c(1, 0, 1), structure(list(), names = character(0)), 
            list(name = c("roo_main=6", "isstilt=0", "st_sub_main_th=other", 
            "roo_main=7", "st_adsb=sub", "st_con_tr=direct", 
            "st_con_rt=sub-room", "stsub2mainth=tsuma", "st_sub_main_th=hira", 
            "roo_main=2", "st_sub2_main_th=hira", "roo_main=3", 
            "st_adsb=add", "st_sub_main_th=tsuma", "st_th=hira", 
            "st_con_rt=main-room", "st_th=tsuma", "st_con_tr=terrace")), list())), class = "igraph"), edgelist = structure(c("roo_main=6", 
    "st_sub_main_th=other", "st_sub_main_th=other", "st_sub_main_th=other", 
    "st_sub_main_th=other", "stsub2mainth=tsuma", "stsub2mainth=tsuma", 
    "stsub2mainth=tsuma", "stsub2mainth=tsuma", "stsub2mainth=tsuma", 
    "st_sub2_main_th=hira", "roo_main=3", "st_sub_main_th=tsuma", 
    "st_th=hira", "st_th=tsuma", "st_th=tsuma", "st_sub_main_th=hira", 
    "st_sub_main_th=hira", "st_con_tr=direct", "st_con_rt=main-room", 
    "st_con_rt=sub-room", "st_con_tr=terrace", "isstilt=0", "roo_main=7", 
    "st_adsb=sub", "st_con_tr=direct", "st_con_rt=sub-room", 
    "st_adsb=sub", "st_sub_main_th=hira", "roo_main=2", "st_con_tr=direct", 
    "st_con_rt=sub-room", "st_adsb=sub", "st_adsb=add", "st_con_rt=sub-room", 
    "st_con_rt=main-room", "st_con_rt=main-room", "st_adsb=add", 
    "st_con_rt=sub-room", "st_adsb=add", "st_adsb=add", "st_adsb=add", 
    "st_adsb=add", "st_adsb=add"), dim = c(22L, 2L)), clustsizes = c(`4` = 4L, 
    `5` = 3L, `7` = 3L, `1` = 2L, `2` = 2L, `3` = 2L, `6` = 2L, 
    `8` = 2L, `9` = 2L, `10` = 2L, `11` = 2L, `12` = 2L)), class = "OCG")

A plot of this object looks like this:

library(linkcomm)
plot(graph, type = "graph")


This graph shows the result of community detection via soft clustering, so each node can belong to several communities (each edge belongs to a single community). Each node's share of participation in each community is drawn as a pie chart on the node.

I want to extract these numerical values (the degree of participation in each community per node, and the community to which each edge belongs) and join them with the original graph object, so that I can draw richer graphs with ggraph. The original graph carries additional node and edge attributes for visualization, e.g. sizing nodes by the number of corresponding transactions, coloring edges by community, and scaling edge width by rule count.
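As a starting point, the node side can be sketched from the `nodeclusters` data frame alone (columns `node` and `cluster`, as in the dput output above). This is a minimal sketch on a toy stand-in, not the package's own calculation: it assumes each node is split equally across the communities it belongs to, which may differ from the proportions that plot() actually draws. The names `membership`, `share`, and `node_share` are made up for illustration.

```r
# Toy stand-in for graph$nodeclusters (same two columns as in the dput output)
nodeclusters <- data.frame(
  node    = c("a", "b", "b", "c", "c", "d"),
  cluster = c(1L, 1L, 2L, 2L, 3L, 3L),
  stringsAsFactors = FALSE
)

# Node-by-cluster membership counts (1 if the node is in the cluster, else 0)
membership <- table(nodeclusters$node, nodeclusters$cluster)

# ASSUMPTION: equal share per community, i.e. row-normalised membership
share <- prop.table(membership, margin = 1)

# Long format (one row per node/cluster pair) that can be joined to node data
node_share <- as.data.frame(share, stringsAsFactors = FALSE)
names(node_share) <- c("node", "cluster", "share")
node_share <- node_share[node_share$share > 0, ]

print(node_share)
```

With the real object, replacing the toy data frame with `graph$nodeclusters` should give one row per node/community pair, which can then be merged onto the node table of the original graph by node name.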

I have read the package vignette and tried some of its functions, but there seems to be no function that extracts the percentages or other numerical data from which they could be derived. The reproduction data above does not seem to contain them either, even though plot() can draw them.

The explanation of the return values of getLinkCommunities on page 19 does not seem to contain this level of detail.

I would appreciate it if anyone who knows the details of this data structure could explain.
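The object does carry an `edgelist` matrix and the `nodeclusters` data frame, so one possible way to label edges is to look up the communities shared by both endpoints. This is a sketch under that assumption, on toy data; it is not confirmed to be how linkcomm assigns edges, and an edge could in principle match zero or several communities. The helper names `clusters_of`, `shared`, and `edge_cluster` are hypothetical.

```r
# Toy stand-ins for graph$nodeclusters and graph$edgelist
nodeclusters <- data.frame(
  node    = c("a", "b", "b", "c", "c", "d"),
  cluster = c(1L, 1L, 2L, 2L, 3L, 3L),
  stringsAsFactors = FALSE
)
edgelist <- matrix(c("a", "b",
                     "b", "c",
                     "c", "d"), ncol = 2, byrow = TRUE)

# List of cluster ids per node, named by node
clusters_of <- split(nodeclusters$cluster, nodeclusters$node)

# ASSUMPTION: an edge belongs to the communities shared by both endpoints
shared <- lapply(seq_len(nrow(edgelist)), function(i) {
  intersect(clusters_of[[edgelist[i, 1]]], clusters_of[[edgelist[i, 2]]])
})

# One row per edge; `n_shared` flags edges with zero or multiple matches
edge_cluster <- data.frame(
  from     = edgelist[, 1],
  to       = edgelist[, 2],
  cluster  = vapply(shared, function(x) paste(x, collapse = ","), character(1)),
  n_shared = lengths(shared),
  stringsAsFactors = FALSE
)

print(edge_cluster)
```

Rows where `n_shared` is not 1 would need a separate rule before the result is joined to the edge table of the original graph.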
