I am struggling to create even a simple Graph Neural Network with TF-GNN to predict two context features, m and s (stored in the graph context as 'mass' and 'scale_factor'). I have spent a lot of time looking through the docs and example models, but I keep running into errors when trying to build the model. I just want to get a simple model working and then make it more complex over time. My graph has a node set called 'points' with 3 features (a, b, c), and an edge set called 'bidirectional_chain', also with 3 features (delta_a, delta_b, delta_c).
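For reference, a single graph with this structure could be built by hand like this (the sizes and values below are made up purely for illustration):

import tensorflow as tf
import tensorflow_gnn as tfgnn

# Illustrative only: a hand-built 4-point chain with the same node, edge and
# context features as my data, filled with made-up values.
example_graph = tfgnn.GraphTensor.from_pieces(
    context=tfgnn.Context.from_fields(
        features={"mass": tf.constant([1.0]),
                  "scale_factor": tf.constant([0.5])}),
    node_sets={
        "points": tfgnn.NodeSet.from_fields(
            sizes=tf.constant([4]),
            features={"a": tf.random.normal([4, 1]),
                      "b": tf.random.normal([4, 1]),
                      "c": tf.random.normal([4, 1])})},
    edge_sets={
        "bidirectional_chain": tfgnn.EdgeSet.from_fields(
            sizes=tf.constant([6]),
            adjacency=tfgnn.Adjacency.from_indices(
                source=("points", tf.constant([0, 1, 1, 2, 2, 3])),
                target=("points", tf.constant([1, 0, 2, 1, 3, 2]))),
            features={"delta_a": tf.random.normal([6, 1]),
                      "delta_b": tf.random.normal([6, 1]),
                      "delta_c": tf.random.normal([6, 1])})})

Here is my code for preparing the data into training and validation sets: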
import tensorflow as tf
import tensorflow_gnn as tfgnn

def decode_fn(proto):
    feature_description = {
        'serialized_graph_tensor': tf.io.FixedLenFeature([], tf.string),
    }
    parsed_features = tf.io.parse_single_example(proto, feature_description)
    graph_tensor = tfgnn.parse_single_example(
        graph_tensor_spec, parsed_features['serialized_graph_tensor'])
    # Pop the targets out of the context so they are returned as labels
    # rather than fed to the model as input features.
    context_features = graph_tensor.context.get_features_dict()
    m = context_features.pop('mass')
    s = context_features.pop('scale_factor')
    new_graph = graph_tensor.replace_features(context=context_features)
    labels = tf.stack([m, s], axis=1)
    return new_graph, labels
graph_schema = tfgnn.read_schema(filepath_graph_schema)
graph_tensor_spec = tfgnn.create_graph_spec_from_schema_pb(graph_schema)

raw_dataset = tf.data.TFRecordDataset([filepath_graph_dataset])

buffer_size = 10000  # Adjust based on dataset size and memory constraints.
# reshuffle_each_iteration=False keeps the shuffle order fixed, so the
# take/skip split below does not leak examples between train and validation.
shuffled_dataset = raw_dataset.shuffle(buffer_size=buffer_size,
                                       reshuffle_each_iteration=False)

# Count the examples (requires one pass over the file).
total_size = sum(1 for _ in raw_dataset)

# Calculate the sizes of the training and validation sets.
train_size = int(total_size * 0.8)
val_size = total_size - train_size

# Split the dataset.
train_dataset = shuffled_dataset.take(train_size)
val_dataset = shuffled_dataset.skip(train_size)

train_dataset = train_dataset.map(decode_fn)
val_dataset = val_dataset.map(decode_fn)

batch_size = 32
train_dataset_batched = train_dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
val_dataset_batched = val_dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
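One thing I am not sure about: from the examples in the docs, I believe the batched GraphTensors may also need to be merged into a single graph with multiple components before being passed to the model. This is my understanding rather than something I have confirmed:

# Unconfirmed assumption on my part: the Keras GNN layers expect a scalar
# GraphTensor, so each batch of graphs is merged into one graph whose
# components are the individual input graphs.
def merge(graph, labels):
    return graph.merge_batch_to_components(), labels

train_dataset_batched = train_dataset_batched.map(merge)
val_dataset_batched = val_dataset_batched.map(merge)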
Apart from that, I think this part is correct. The next part, creating the model, is where I am struggling. I understand it is meant to take the following form:
from tensorflow_gnn.models.gcn import gcn_conv

def gnn(graph):
    gcn_layer = gcn_conv.GCNConv(3)
    graph = gcn_layer(graph, edge_set_name='bidirectional_chain')
    return graph

def set_initial_node_state(node_set, node_set_name):
    # Concatenate the three point features into a single initial hidden state.
    if node_set_name == "points":
        return tf.keras.layers.Concatenate()(
            [node_set["a"], node_set["b"], node_set["c"]])
model_input_graph_spec, label_spec = train_dataset.element_spec
input_graph = tf.keras.layers.Input(type_spec=model_input_graph_spec)
graph = tfgnn.keras.layers.MapFeatures(
    node_sets_fn=set_initial_node_state)(input_graph)
graph = gnn(graph)
pooled_features = tfgnn.keras.layers.Pool(
    tfgnn.CONTEXT, "mean", node_set_name="points")(graph)  # ERROR here
predictions = tf.keras.layers.Dense(2)(pooled_features)  # one output each for m and s
model = tf.keras.Model(input_graph, predictions)
# m and s are continuous values, so this is a regression task; use a
# regression loss and metric rather than binary cross-entropy.
loss = tf.keras.losses.MeanSquaredError()
metrics = [tf.keras.metrics.MeanAbsoluteError()]
model.compile(tf.keras.optimizers.Adam(), loss=loss, metrics=metrics)
model.summary()
history = model.fit(train_dataset_batched,
                    steps_per_epoch=10,
                    epochs=200,
                    validation_data=val_dataset_batched)
But it keeps failing at the pooled_features line with the error shown below:
AttributeError: Exception encountered when calling layer "pool_4" (type Pool).
'SymbolicTensor' object has no attribute 'rank'
Call arguments received by layer "pool_4" (type Pool):
graph=tf.Tensor(shape=(None, 3), dtype=float32) tag=None reduce_type=None edge_set_name=None node_set_name=None feature_name=None
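Reading the call arguments in the traceback, Pool seems to have received a plain tensor of shape (None, 3) rather than a GraphTensor, so I suspect my gnn() function returns only the updated node states from GCNConv instead of a new graph. My best guess at a fix, based on the GraphUpdate/NodeSetUpdate pattern in the docs, is something like this (the Dense(3) next-state size is an arbitrary guess on my part):

import tensorflow as tf
import tensorflow_gnn as tfgnn
from tensorflow_gnn.models.gcn import gcn_conv

# Untested sketch: wrap GCNConv in a GraphUpdate so the layer returns a
# GraphTensor with updated node states rather than a bare tensor.
def gnn(graph):
    return tfgnn.keras.layers.GraphUpdate(
        node_sets={
            "points": tfgnn.keras.layers.NodeSetUpdate(
                {"bidirectional_chain": gcn_conv.GCNConv(3)},
                tfgnn.keras.layers.NextStateFromConcat(
                    tf.keras.layers.Dense(3)))})(graph)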
Is that the right direction, and if not, how do I resolve this error?