RandomLinkSplit not working with HeteroData


I am having serious trouble with torch-geometric when working with my own data. I am trying to build a graph with 4 different node types (only one of which has node features; the others are plain nodes) and 5 different edge types (only one of which carries a weight). I have managed to do so by building a HeteroData() object and loading the different matrices with labels, attributes and so on.

The problem arises when I try to call RandomLinkSplit. Here's what my call looks like:

import torch_geometric.transforms as T

transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=[
        ('Patient', 'suffers_from', 'Diagnosis'),
        ('bla', 'bla', 'bla'),  # I copy all the edge types here
    ],
)

but I get an AssertionError with no message, raised by this check:

assert isinstance(rev_edge_types, list)

So I thought I needed to convert the graph to undirected (for some reason I don't understand), as the tutorial does, and then also pass the reverse edge types (even though I don't need them):

import torch_geometric.transforms as T

data = T.ToUndirected()(data)
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=[
        ('Patient', 'suffers_from', 'Diagnosis'),
        ('bla', 'bla', 'bla'),  # I copy all the edge types here
    ],
    rev_edge_types=[
        ('Diagnosis', 'rev_suffers_from', 'Patient'),
        ...
    ],
)

but this time I get a different error: unsupported operand type(s) for *: 'Tensor' and 'NoneType'.

Does anyone have an idea why this is happening? I am simply trying to do a train/validation/test split. From the docs I understood that heterogeneous graphs should be well supported, so I don't understand why this is not working, and I have been trying different things for quite a while.

Any help would be appreciated! Thanks


1 answer

Kaan Dönmez

Try splitting one edge type at a time and training on one edge type at a time. Pass a single edge type and its reverse as plain tuples instead of lists:

import torch_geometric.transforms as T

transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=('Patient', 'suffers_from', 'Diagnosis'),
    rev_edge_types=('Diagnosis', 'rev_suffers_from', 'Patient'),
)