
Interpreting nodes and edges with saliency maps in GCN (sparse)

This demo shows how to use integrated gradients in graph convolutional networks to obtain accurate importance estimates for both nodes and edges [1]. The notebook consists of three parts:

- setting up the node classification problem for the Cora citation network
- training and evaluating a GCN model for node classification
- calculating node and edge importances for the model’s predictions of query (“target”) nodes
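For reference, integrated gradients [1] attribute a model prediction $F(x)$ to each input feature $x_i$ relative to a baseline input $x'$ by accumulating the gradient of $F$ along the straight-line path from $x'$ to $x$:

$$\mathrm{IG}_i(x) = (x_i - x'_i)\int_0^1 \frac{\partial F\big(x' + \alpha(x - x')\big)}{\partial x_i}\,d\alpha$$

For graph data, the same idea can be applied to the node features and to the adjacency entries, with the integral approximated numerically over a finite number of interpolation steps.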

References

[1] Axiomatic Attribution for Deep Networks. M. Sundararajan, A. Taly, and Q. Yan. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017.

[2] Adversarial Examples on Graph Data: Deep Insights into Attack and Defense. H. Wu, C. Wang, Y. Tyshetskiy, A. Docherty, K. Lu, and L. Zhu. arXiv: 1903.01610.

[3]:
import networkx as nx
import pandas as pd
import numpy as np
from scipy import stats
import os
import time
import stellargraph as sg
from stellargraph.mapper import FullBatchNodeGenerator
from stellargraph.layer import GCN
from tensorflow import keras
from tensorflow.keras import layers, optimizers, losses, metrics, Model, regularizers
from sklearn import preprocessing, feature_extraction, model_selection
from copy import deepcopy
import matplotlib.pyplot as plt
from stellargraph import datasets
from IPython.display import display, HTML
%matplotlib inline

Loading the CORA network

(See the “Loading from Pandas” demo for details on how data can be loaded.)

[4]:
dataset = datasets.Cora()
display(HTML(dataset.description))
G, subjects = dataset.load()
The Cora dataset consists of 2708 scientific publications classified into one of seven classes. The citation network consists of 5429 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.

Splitting the data

For machine learning we want to take a subset of the nodes for training, and use the rest for validation and testing. We’ll use scikit-learn again to do this.

Here we’re taking 140 node labels for training, 500 for validation, and the rest for testing.

[5]:
train_subjects, test_subjects = model_selection.train_test_split(
    subjects, train_size=140, test_size=None, stratify=subjects
)
val_subjects, test_subjects = model_selection.train_test_split(
    test_subjects, train_size=500, test_size=None, stratify=test_subjects
)

Converting to numeric arrays

For our categorical target, we will use one-hot vectors that will be fed into a softmax Keras layer during training. To do this conversion we can use the LabelBinarizer transform from scikit-learn.

[6]:
target_encoding = preprocessing.LabelBinarizer()

train_targets = target_encoding.fit_transform(train_subjects)
val_targets = target_encoding.transform(val_subjects)
test_targets = target_encoding.transform(test_subjects)

all_targets = target_encoding.transform(subjects)

Creating the GCN model in Keras

To feed data from the graph to the Keras model we need a generator. Since GCN is a full-batch model, we use the FullBatchNodeGenerator class.

[7]:
generator = FullBatchNodeGenerator(G, sparse=True)
Using GCN (local pooling) filters...

For training we map only the training nodes returned from our splitter, together with their target values.

[8]:
train_gen = generator.flow(train_subjects.index, train_targets)

Now we can specify our machine learning model: in this example we use two GCN layers, each with 16-dimensional hidden node features and ELU activation functions.

[9]:
layer_sizes = [16, 16]
gcn = GCN(
    layer_sizes=layer_sizes,
    activations=["elu", "elu"],
    generator=generator,
    dropout=0.3,
    kernel_regularizer=regularizers.l2(5e-4),
)
[10]:
# Expose the input and output tensors of the GCN model for node prediction, via GCN.in_out_tensors() method:
x_inp, x_out = gcn.in_out_tensors()
# Snap the final estimator layer to x_out
x_out = layers.Dense(units=train_targets.shape[1], activation="softmax")(x_out)

Training the model

Now let’s create the actual Keras model, with the input tensors x_inp and the output tensor being the predictions x_out from the final dense layer.

[11]:
model = keras.Model(inputs=x_inp, outputs=x_out)

model.compile(
    optimizer=optimizers.Adam(learning_rate=0.01),  # decay=0.001),
    loss=losses.categorical_crossentropy,
    metrics=[metrics.categorical_accuracy],
)

Train the model, keeping track of its loss and accuracy on the training set, and its generalisation performance on the validation set (we need to create another generator over the validation data for this).

[12]:
val_gen = generator.flow(val_subjects.index, val_targets)

Train the model

[13]:
history = model.fit(
    train_gen, shuffle=False, epochs=20, verbose=2, validation_data=val_gen
)
Epoch 1/20
1/1 - 0s - loss: 2.0886 - categorical_accuracy: 0.0786 - val_loss: 1.8735 - val_categorical_accuracy: 0.2860
Epoch 2/20
1/1 - 0s - loss: 1.8273 - categorical_accuracy: 0.3071 - val_loss: 1.7735 - val_categorical_accuracy: 0.3080
Epoch 3/20
1/1 - 0s - loss: 1.6816 - categorical_accuracy: 0.3429 - val_loss: 1.7049 - val_categorical_accuracy: 0.3280
Epoch 4/20
1/1 - 0s - loss: 1.5697 - categorical_accuracy: 0.3714 - val_loss: 1.6350 - val_categorical_accuracy: 0.4240
Epoch 5/20
1/1 - 0s - loss: 1.4508 - categorical_accuracy: 0.5000 - val_loss: 1.5633 - val_categorical_accuracy: 0.4860
Epoch 6/20
1/1 - 0s - loss: 1.3410 - categorical_accuracy: 0.5929 - val_loss: 1.4933 - val_categorical_accuracy: 0.5100
Epoch 7/20
1/1 - 0s - loss: 1.2154 - categorical_accuracy: 0.6714 - val_loss: 1.4239 - val_categorical_accuracy: 0.5420
Epoch 8/20
1/1 - 0s - loss: 1.1221 - categorical_accuracy: 0.6714 - val_loss: 1.3527 - val_categorical_accuracy: 0.5540
Epoch 9/20
1/1 - 0s - loss: 1.0248 - categorical_accuracy: 0.7286 - val_loss: 1.2816 - val_categorical_accuracy: 0.5820
Epoch 10/20
1/1 - 0s - loss: 0.9370 - categorical_accuracy: 0.7429 - val_loss: 1.2150 - val_categorical_accuracy: 0.6100
Epoch 11/20
1/1 - 0s - loss: 0.8205 - categorical_accuracy: 0.7929 - val_loss: 1.1561 - val_categorical_accuracy: 0.6420
Epoch 12/20
1/1 - 0s - loss: 0.7672 - categorical_accuracy: 0.8214 - val_loss: 1.1058 - val_categorical_accuracy: 0.6840
Epoch 13/20
1/1 - 0s - loss: 0.6830 - categorical_accuracy: 0.8500 - val_loss: 1.0636 - val_categorical_accuracy: 0.7120
Epoch 14/20
1/1 - 0s - loss: 0.6202 - categorical_accuracy: 0.8786 - val_loss: 1.0272 - val_categorical_accuracy: 0.7220
Epoch 15/20
1/1 - 0s - loss: 0.5606 - categorical_accuracy: 0.9143 - val_loss: 0.9955 - val_categorical_accuracy: 0.7380
Epoch 16/20
1/1 - 0s - loss: 0.5297 - categorical_accuracy: 0.9071 - val_loss: 0.9688 - val_categorical_accuracy: 0.7560
Epoch 17/20
1/1 - 0s - loss: 0.4936 - categorical_accuracy: 0.9429 - val_loss: 0.9467 - val_categorical_accuracy: 0.7600
Epoch 18/20
1/1 - 0s - loss: 0.4496 - categorical_accuracy: 0.9571 - val_loss: 0.9290 - val_categorical_accuracy: 0.7740
Epoch 19/20
1/1 - 0s - loss: 0.4013 - categorical_accuracy: 0.9643 - val_loss: 0.9158 - val_categorical_accuracy: 0.7780
Epoch 20/20
1/1 - 0s - loss: 0.3808 - categorical_accuracy: 0.9786 - val_loss: 0.9063 - val_categorical_accuracy: 0.7820
[14]:
sg.utils.plot_history(history)
[Output: training history plots of loss and accuracy for the training and validation sets]

Evaluate the trained model on the test set

[15]:
test_gen = generator.flow(test_subjects.index, test_targets)
test_metrics = model.evaluate(test_gen)
print("\nTest Set Metrics:")
for name, val in zip(model.metrics_names, test_metrics):
    print("\t{}: {:0.4f}".format(name, val))

Test Set Metrics:
        loss: 0.9140
        categorical_accuracy: 0.7843
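
The node and edge importance calculations themselves follow in the later cells of the full notebook. As a conceptual sketch of the underlying technique from [1] (not the StellarGraph API), the snippet below approximates integrated gradients with tf.GradientTape for a generic Keras classifier over dense feature vectors; the integrated_gradients helper, the toy stand-in model, and the all-zeros baseline are illustrative assumptions.

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def integrated_gradients(model, x, baseline, class_idx, steps=50):
    # Approximate the path integral of gradients from `baseline` to `x`
    # (Sundararajan et al. [1]) for the probability of class `class_idx`.
    alphas = tf.linspace(0.0, 1.0, steps + 1)[:, None]
    interpolated = baseline[None, :] + alphas * (x - baseline)[None, :]
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        probs = model(interpolated)[:, class_idx]
    grads = tape.gradient(probs, interpolated)
    # Trapezoidal approximation of the integral, scaled by (x - baseline).
    avg_grads = tf.reduce_mean((grads[:-1] + grads[1:]) / 2.0, axis=0)
    return (x - baseline) * avg_grads


# Toy usage on a stand-in classifier over 1433-dimensional bag-of-words features.
toy_model = keras.Sequential(
    [layers.Dense(16, activation="elu"), layers.Dense(7, activation="softmax")]
)
x = tf.constant(np.random.binomial(1, 0.05, 1433), dtype=tf.float32)
baseline = tf.zeros_like(x)  # "all words absent" reference input
feature_attributions = integrated_gradients(toy_model, x, baseline, class_idx=0)

With an all-zeros baseline (a publication containing none of the dictionary words), each attribution measures how much the presence of a word moves the prediction for the chosen class, which is the kind of node-feature importance the full notebook computes for the trained GCN.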
