2025-03-03 graphs lecture 11

[[lecture-data]]
Summary

1. Graph Isomorphism Networks and the WL test
2. Counting cycles via homomorphism densities

Recall from last time that injective GNNs are as powerful as the WL test:

Theorem

Let $G_1$ and $G_2$ be two graphs that the Weisfeiler-Leman Graph Isomorphism Test determines are non-isomorphic. A GNN $\Phi$ maps $G_1$ and $G_2$ to distinct embeddings, $\Phi(G_1) \neq \Phi(G_2)$, if the following hold:

  1. $\Phi$ aggregates and updates as
    $(x_\ell)_v = \phi\left((x_{\ell-1})_v,\ \varphi(\{(x_{\ell-1})_u : u \in N(v)\})\right)$
    where $\phi, \varphi$ are injective.
  2. The readout function is permutation invariant and injective.
Graph isomorphism network

A graph isomorphism network (GIN) layer is defined as

$(x_\ell)_v = \sigma\left(W_\ell\left((1+\varepsilon_\ell)(x_{\ell-1})_v + \sum_{u \in N(v)} (x_{\ell-1})_u\right)\right)$

We can write this layer in terms of the aggregation function $\varphi$ over the neighborhood multiset:

$\varphi(\{(x_{\ell-1})_u : u \in N(v)\}) = \sum_{u \in N(v)} (x_{\ell-1})_u$

And the update function $\phi$:

$\phi\left((x_{\ell-1})_v,\ \varphi(\cdot)\right) = \sigma\left(W_\ell\left((1+\varepsilon_\ell)(x_{\ell-1})_v + \varphi(\cdot)\right)\right)$

The motivation for this architecture comes from the fact that injective GNNs are as powerful as the WL test, which holds in this case as long as ϕ,φ as defined above are injective.

see Graph Isomorphism Network
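Below is a minimal sketch of one GIN layer using a dense adjacency matrix, assuming $\sigma$ is ReLU and $W_\ell$ is a single linear map (the GIN paper uses an MLP):

```python
import torch

# A sketch of one GIN layer: x_v <- sigma(W((1 + eps) * x_v + sum over neighbors)).
# X: (n, d) node features, A: (n, n) dense adjacency, W: (d, d_out) weights.
def gin_layer(X: torch.Tensor, A: torch.Tensor, W: torch.Tensor,
              eps: float = 0.0) -> torch.Tensor:
    neighbor_sum = A @ X                 # sum of neighbors' embeddings (aggregation)
    combined = (1 + eps) * X + neighbor_sum
    return torch.relu(combined @ W)      # assuming sigma = ReLU
```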

notes

Here, $\phi \circ \varphi$ is modeled by a perceptron or MLP over the neighborhood multiset:

  • Hornik (1989) says an MLP can model any injective function - this follows from the universal approximation theorem
  • Injective $\phi, \varphi$ exist for multisets $X$ (proven in the paper where GINs are introduced)

Can we construct an injective map from multisets to embeddings using this composition?
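A small illustration of why the sum aggregator can be injective over multisets (the GIN paper's argument uses one-hot or countable features): with one-hot embeddings, the neighborhood sum recovers the exact multiplicity of each element.

```python
import torch

# With one-hot features, summing a multiset's embeddings recovers its counts,
# so distinct multisets map to distinct sums.
onehot = torch.eye(3)          # one-hot embeddings for elements a, b, c
m1 = [0, 0, 2]                 # multiset {a, a, c}
m2 = [0, 1, 1]                 # multiset {a, b, b}
print(onehot[m1].sum(dim=0))   # tensor([2., 0., 1.])
print(onehot[m2].sum(dim=0))   # tensor([1., 2., 0.]) -- a different embedding
```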

Example

2025-02-24-graph.png
Can a GIN tell them apart? We assume $W_\ell = 1$ and $\varepsilon_\ell = 0$. At the first layer, we take the sum of each node's neighbors' embeddings and add it to the node's own embedding.

2025-03-03-graph.png
$\ell = 1$:

$G_1$:

  • $x_1 = 1 + 1 = 2$
  • $x_2 = 1 + 1 + 1 = 3$
  • $x_3 = 1 + 1 = 2$
  • output is $\{2, 3, 2\}$

$G_2$:

  • $x_1 = x_2 = x_3 = 1 + 1 + 1 = 3$
  • output is $\{3, 3, 3\}$

As long as we have an injective function from multisets to embeddings, we are OK since we will be able to distinguish between these. To confirm we have this, we look at the graph level readout.

In the GIN paper, the readout is defined in terms of a readout at each layer.

Example

For the example above,

$X_G = \mathrm{CONCAT}\left(\mathrm{readout}(\{(x_\ell)_v : v \in V\}) \;\middle|\; \ell = 0, 1, \dots, L\right)$

And so

  • $G_1$: $[3 \mid 7]$
  • $G_2$: $[3 \mid 9]$ (verified in the sketch below)
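A quick numerical check of these readouts, assuming from the computed values above that $G_1$ is a path on 3 nodes and $G_2$ is a triangle:

```python
import torch

A1 = torch.tensor([[0., 1, 0], [1, 0, 1], [0, 1, 0]])  # assumed: path 1-2-3
A2 = torch.tensor([[0., 1, 1], [1, 0, 1], [1, 1, 0]])  # assumed: triangle
x0 = torch.ones(3)                                     # anonymous input

for name, A in [("G1", A1), ("G2", A2)]:
    x1 = x0 + A @ x0            # GIN layer with W = 1, eps = 0
    print(name, [x0.sum().item(), x1.sum().item()])
# prints G1 [3.0, 7.0] and G2 [3.0, 9.0]
```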
Note

  • The concatenation is not strictly needed, but the authors claim it improves generalization
    • the order of the concatenation doesn't matter as long as it is consistent across graphs
  • with the sum as the readout, the GIN generalizes the WL test

Takeaway

Thus, a GIN is a maximally powerful GNN for graph-level tasks with anonymous inputs (i.e., $x = \mathbf{1}$).

see GINs are maximally powerful for anonymous input graphs

Is being as powerful as the WL test enough?

Example

Can WL test distinguish between the two graphs below?
2025-03-03-graph-1.png
The output of the color refinement algorithm for Graph 1 gives:

  • 1, 2, 4, 5, 7, 8, 9, 10 have color 1
  • 3, 6 have color 2

And for Graph 2:

  • 1, 2, 5, 6, 7, 8, 9, 10 have color 1
  • 3,4 have color 2

So in this case, the multisets of the colors are the same between G1 and G2, and we cannot distinguish between them.

So, NO! The computational graphs are the same, despite these being different graphs.

Exercise

Check this yourself.
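A minimal sketch of 1-WL color refinement that can be used for this check (`adj` maps each node to its neighbor list; names and structure here are illustrative, not the lecture's code):

```python
from collections import Counter

def color_refinement(adj: dict, rounds: int = 3) -> Counter:
    colors = {v: 0 for v in adj}                   # anonymous start: all one color
    for _ in range(rounds):
        # signature = (own color, sorted multiset of neighbor colors)
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        relabel = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: relabel[sigs[v]] for v in adj}
    return Counter(colors.values())                # histogram of final colors
```

Two graphs with different color histograms are certainly non-isomorphic; identical histograms (as here) are inconclusive.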

Instead, we could count the number of cycles to see if these graphs are different.

If we had a GNN that could count cycles, we could discriminate between these graphs.

But GNNs are at most as powerful as the WL test, so GNNs cannot discriminate between these graphs either. This implies that they cannot count cycles.

20250304-2025-03-03.png

There is another way to count cycles: densities of cycle homomorphisms

Graph Homomorphism

Let $G = (V, E)$ and $F = (V', E')$. A graph homomorphism from $F$ to $G$ is a map $\gamma : V' \to V$ such that

$(i, j) \in E' \implies (\gamma(i), \gamma(j)) \in E$

i.e., homomorphisms are adjacency-preserving maps (of $F$ into $G$)

see graph homomorphism

Exercise

How is a homomorphism not the same as a graph isomorphism?

Answer

Recall the definition of the isomorphism

Graph Isomorphism

Let $G$ and $G'$ be two graphs. A graph isomorphism between $G$ and $G'$ is a bijection $M : V(G) \to V(G')$ such that for all $i, j \in V(G)$,
$(i, j) \in E(G) \iff (M(i), M(j)) \in E(G')$

The isomorphism is a bijection that preserves edges in both directions. The homomorphism need not be injective or surjective and preserves edges in one direction only; when $\gamma$ is injective, the image of $F$ is a subgraph of $G$.

Example
See the lecture notes for a drawing of a valid homomorphism.

We will denote the total number of homomorphisms from F to G as hom(F,G).

homomorphism density

Let $G = (V, E)$ and $F = (V', E')$ be graphs. The homomorphism density from $F$ to $G$, denoted $t(F, G)$, is given as

$t(F, G) = \frac{\hom(F, G)}{|V|^{|V'|}}$

where $\hom(F, G)$ is the total number of homomorphisms from $F$ to $G$

see homomorphism density
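For small graphs, $\hom(F, G)$ can be computed by brute force over all maps $V' \to V$; a sketch (hypothetical helper, not from the lecture):

```python
import itertools

# Count homomorphisms from F to G by checking every map gamma: V(F) -> V(G).
# Edges are undirected pairs over vertices 0..nF-1 and 0..nG-1.
def hom(F_edges, nF, G_edges, nG):
    G_adj = set(G_edges) | {(j, i) for i, j in G_edges}
    return sum(
        all((gamma[i], gamma[j]) in G_adj for i, j in F_edges)
        for gamma in itertools.product(range(nG), repeat=nF)
    )

triangle = [(0, 1), (1, 2), (2, 0)]
print(hom(triangle, 3, triangle, 3))          # 6
print(hom(triangle, 3, triangle, 3) / 3**3)   # t(C3, C3) = 6/27
```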

claim

Let $C_k$ be the $k$-cycle. Let $G$ be a graph with adjacency matrix eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$.

For $k \geq 4$, we have

$t(C_k, G) = \frac{\sum_i \lambda_i^k}{n^k}$

where $t(\cdot)$ is the homomorphism density

Proof

Note that

$\hom(C_k, G) = \sum_{i, j, k, \dots, \zeta} A_{ij} A_{jk} A_{k\ell} \cdots A_{m\zeta} A_{\zeta i} = \sum_{i, k, \ell, \dots, \zeta} \Big(\sum_j A_{ij} A_{jk}\Big) A_{k\ell} \cdots A_{\zeta i} = \sum_{i, k, \ell, \dots, \zeta} [A^2]_{ik} A_{k\ell} \cdots A_{\zeta i} = \cdots = \sum_i [A^k]_{ii} = \mathrm{Tr}(A^k) = \sum_i \lambda_i^k$

Thus $t(C_k, G) = \frac{\sum_i \lambda_i^k}{n^k}$.
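A quick numerical check of the claim (here on a triangle, whose adjacency eigenvalues are $2, -1, -1$):

```python
import numpy as np

def cycle_hom_density(A: np.ndarray, k: int) -> float:
    n = A.shape[0]
    lam = np.linalg.eigvalsh(A)       # adjacency matrix is symmetric
    return (lam ** k).sum() / n ** k  # = Tr(A^k) / n^k

A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)  # triangle
print(cycle_hom_density(A, 4))        # (2^4 + 1 + 1) / 3^4 = 18/81 ≈ 0.222
```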

Note

This implies that cycles can be counted using a convolutional GNN with white input (see Lecture 9) with graph shift operator $S = \frac{A}{n}$.

see cycle homomorphism density is given by the trace of the adjacency matrix

This is similar to the fact that we can verify whether graphs without node features and with different Laplacian eigenvalues are not isomorphic.

To be able to count these cycles, we need the input to be white. For the GIN, we needed the inputs to be anonymous, i.e., each $x_i = 1$.

For this method to work, we need each $\hat{x}_i = 1$, i.e., anonymous in the spectral domain.

This means that white inputs help improve expressivity (as opposed to anonymous inputs $x_i = 1 \ \forall i$).

The "best of both worlds" can be achieved by merging the two methods: using a GIN with white inputs

Since designing white inputs requires computing an eigendecomposition, we usually relax this condition to random inputs instead, which almost surely satisfy $\hat{x}_i \neq 0$ for all $i$.

Takeaway

For maximum expressivity, we can use a GIN with white, anonymous inputs. That is,

  • Each $x_i = 1$ (anonymous)
  • Each $\hat{x}_i = 1$ (white)

However, designing white inputs requires computing an eigendecomposition, which is expensive.

Instead, we usually relax this condition to random inputs, which almost surely satisfy $\hat{x}_i \neq 0$ for all $i$.

see random graphs in a gin are good for graph isomorphism
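A small demonstration of the spectral-domain claim: the GFT of a signal $x$ is $\hat{x} = V^\top x$, where $V$ holds the eigenvectors of the shift operator. An anonymous input is concentrated on a few frequencies, while a random input almost surely has every $\hat{x}_i$ nonzero (a sketch, with $S = A$ assumed):

```python
import torch

n = 6
A = (torch.rand(n, n) < 0.5).float()
A = torch.triu(A, diagonal=1)
A = A + A.T                            # random symmetric adjacency
lam, V = torch.linalg.eigh(A)          # eigendecomposition of the shift operator

x_anon = torch.ones(n)                 # anonymous input
x_rand = torch.randn(n)                # random input
print(V.T @ x_anon)                    # generally concentrated on few frequencies
print(V.T @ x_rand)                    # almost surely nonzero everywhere
```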

TBC in HW2

Detour:

see lecture 11 notebook
torch_geometric

Node features are represented as an $n \times d$ tensor.

`to_networkx` from `torch_geometric.utils` converts a graph to a networkx object for plotting :)

In general, we are going to use benchmark datasets.

see the notebook for defining a GNN in PyG - the layers are implemented already (a minimal sketch follows)
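A minimal sketch of a GIN for graph-level tasks in PyG, assuming the built-in `GINConv` and `global_add_pool` (the notebook's own model may differ):

```python
import torch
from torch_geometric.nn import GINConv, global_add_pool

class GIN(torch.nn.Module):
    def __init__(self, d_in: int, d_hid: int, d_out: int):
        super().__init__()
        # GINConv wraps the learnable MLP that plays the role of phi
        mlp1 = torch.nn.Sequential(torch.nn.Linear(d_in, d_hid), torch.nn.ReLU())
        mlp2 = torch.nn.Sequential(torch.nn.Linear(d_hid, d_hid), torch.nn.ReLU())
        self.conv1 = GINConv(mlp1)
        self.conv2 = GINConv(mlp2)
        self.lin = torch.nn.Linear(d_hid, d_out)

    def forward(self, x, edge_index, batch):
        x = self.conv1(x, edge_index)               # message passing layer 1
        x = self.conv2(x, edge_index)               # message passing layer 2
        return self.lin(global_add_pool(x, batch))  # sum readout per graph
```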