2025-03-24 graphs lecture 14

[[lecture-data]]

Summary

last time

today ([[Lecture 14.pdf]])

  • what if my graph grows/too large?
  • Graph limit: graphon
Homework 2 notes

  • val acc not stable for expressivity : normal since small val set. Loss smooth
  • stability: 10-30% goal to see stable degrades for different models
    • gnn slower than for filters
    • more layers = more stable for gnn (more nonlinearities)
    • discuss degradation
  • Last prob
    • penalize vs not penalized
    • plot model 2’ vs mod 2

Announcement on final project

  • 1-2 ppl
  • either novel graph data/train model : 10min presentation
  • research paper presentation on any modern GNN/graph DS (past 5 years): 20 min presentation
    • review/discussion


Limits of Graphs

Last time, we looked at different graph perturbations and how to design stable GNNs. We saw how GNNs inherit stability from their layers and that GNNs perform better than their constituent filters because of the scattering of spectral information by the nonlinear activation function.

Question

  1. What if the graph grows?
  2. What if the graph is too large and I don’t have enough resources to train on it? (recall GNN forward pass is O(LK|E|) where L is the number of layers and K is the order of the filter polynomial)

To address problem 2 (what if the graph is too large?), we can subsample the graph with a random graph model. We will talk about this later.

Today, we turn to problem 1 (what if the graph grows?). To address this, we want to know

Question

  1. Can we measure how “close” two large graphs are? ie, can we define a metric for two (large) graphs?
  • If we can, then we can see if a GNN is continuous. If we can prove the GNN is continuous, then we are OK (perturbations will affect the function in a predictable way). Then,
  1. Can we prove the GNN is continuous WRT to this metric?

What does it mean for two graphs to be “close”? We’ll say this happens when they converge to a common limit.

defining a metric between graphs

What does it mean for graphs to be “close”?

To measure the subgraph counts, we can use homomorphism density. Recall from lecture 11 the definition of a graph homomorphism

Graph Homomorphism

Let G=(V,E) and F=(V,E). A graph homomorphism from F to G is a map γ:FG such that
(i,j)E(γ(i),γ(j))E
ie, homomorphisms are adjacency-preserving maps (of F; the adjacency of F must be preserved in G.)

Example

20250324-2025-03-24-graphs.png
There are a few homomorphisms from F to G:
20250324-2025-03-24-graphs-1.png

  • (a,b,c)(1,2,3)
    • And trivially, every permutation of (a,b,c) is also a homomorphism here

There are also:

  • (a,b,c)(1,2,4)
  • (a,b,c)(1,3,4)
  • (a,b,c)(2,3,4)
    and for each of the above, each permutation of (a,b,c) as well.

The homomorphism count hom(F,G)=24 in this case.

Recall also the definition for the homomorphism density:

homomorphism density

Let G=(V,E) and F=(V,E) be graphs. The homomorphism density from F to G, denoted t(F,G) is given as
t(F,G)=hom(F,G)|V||V|
Where hom(F,G) is the the total number of homomorphisms from F to G.

We can think of this as the probability of sampling a version of F from the graph G.

Example

in the example above,

t(F,G)=2443=38

This equates to the probability of sampling a triangle from G when we sample 3 vertices.

The astute reader will see that we can use the homomorphism density to compare graphs from the statistics POV of sampling. To do this, we need some subgraphs that serve as references for the types of counts we want (eg. triangles).

motif

Let F be a graph. F is a motif if it is

(see graph motif)

We can use this to create an idea of “closeness” for two graphs G1,G2. It makes sense that G1 and G2 are “close” if for all motifs F, we have t(F,G1)t(F,G2).

To formalize this idea into a definition, we first need to define what a convergent graph sequence is.

Convergent graph sequences and graphons

More info from Levàsz, Chayes, Borgs, Vesztergambi starting 2008-on

To begin, we need to define the “limit” of a sequence of graphs as the number of nodes n grows.

We call the limit of a sequence of graphs of increasing node count a {graphon}

graphon

A graphon is a symmetric, bounded, measurable function

W:[0,1]2[0,1]

ie, it is a bounded kernel (weighting function).

We can think of a graphon as a graph with an uncountable number of nodes u[0,1] and edges (u,v) with weights W(u,v).

(see graphon)

Note

A more general definition is

W:Ω×Ω[0,1]

where Ω is a sample space with some probability measure f. We can always map Ω onto [0,1] using a measure-preserving map (as long as the CDF of f is strictly monotone), so we can use [0,1]2 WLOG.

Exercise

Show that if the CDF of f is strictly monotone, we can map Ω to the interval [0,1] with a measure-preserving map.

Note

The codomain of W may be (0,B] where BR<+. However, we usually have B=1 since W(u,v) often represents a probability.

Example

20250324-graphon.png
From left to right:

  1. block model with same community sizes
  2. different community sizes, and
  3. w(u,v)=exp((uv)2σ2)

If the limit of the graph sequence Gn is a graphon, then the limit of the graph homomorphism density sequence t(F,Gn) must be the {graphon homomorphism density.}

graphon homomorphism density

The density from a graph motif F=(V,E) to a graphon W is

t~(F,W)=[0,1]|V|(i,j)EW(ui,uj)iVdui

This is the generalization of the definition of the homomorphism density

If W(u,v)<1 for all u,v, then t(F,W) can be interpreted as {the probability of sampling the motif F from the graphon W.}

Example

Erdos-Renyi with p=0.4
20250324-2025-03-24-graphs-2.png

(see graphon homomorphism density)

limnt(F,Gn)=t~(F,W),,,F Gnn is a convergent graph sequence with limit W.

Sequence of Convergent Graphs

Let Gnn be a graph sequence such that limnt(F,Gn) exists for all motifs F. Then this sequence is a convergent graph sequence and its limit is given by a graphon.

(see convergent graph sequence)

metric on graphs

While t(F,G) gives us a way to define convergent sequences of graphs, we can’t use it to measure a distance between graphs without computing t(F,G1) and t(F,G2) for all F.

We want to use a metric that is easier to calculate. Eventually, we want a distance for arbitrary graphs G and G which may have different numbers of nodes. However, we begin by building our definition from the simplest case where the node counts and labels are the same.

same node count and labels

Let G and G be graphs with the same n and the same node sets V=V (ie, same node labelling).

L1 norm for graphs

L1 norm or “edit distance”

d1(G,G)=||AA||1$$where$A,A$aretheunweighted[[ConceptWiki/adjacencymatrixadjacencymatrices]]of$G,G$respectively.

see graph edit distance

Cut Norm

Let B be a matrix. The cut norm is given as

||B||=maxS[n],T[n]|sS,tTBts|

see cut norm

cut distance (same node count, same labels)

Let G and G be graphs with the same n and the same node sets V=V (ie, same node labelling). The cut distance is given by

d(G,G)=||AA||

where |||| is the cut norm.

The cut distance will be our metric of choice in most cases.

same node count, different labels

cut distance (same node count, potentially different labels)

Let G and G have the same number of nodes n. The cut distance is given by

δ^(G,G)=minpP||APTAP||

ie, we look at all possible permutations of the nodes of G WRT the node labelings of G.

different node counts

Let Gn and Gm be graphs with n and m nodes respectively, where nm. Since these are not comparable on their own, we need to bring them into common frame of reference in order to calculate distances between them.

To do this, we can bring them into graphon space. So we need to define graphons induced by these graphs.

induced graphon

Let Gn be a graph with n nodes. The graphon induced by Gn is

Wn(u,v)=i=1nj=1nAij1(uIi)1(vIj)

where 1 is the indicator function and Ik is defined as

Ik={[k1n,kn)1kn1[n1n,1]k=n

see induced graphon

Example

20250324-induced-graphon.png

This “chops up” the axes into intervals of size 1n and assigns values based on A. We can think of this as a “heat map” of A.

We can define both Gn and Gm as kernels on [0,1]2 so that they are compatible objects. This means we need a notion of the cut norm for kernels.

cut norm (kernels)

Let W be a kernel in [0,1]2. Its cut norm is defined as

||W||=supS,T[0,1]|S×TW(u,v),du,dv|

see kernel cut norm

cut metric (kernels)

For two kernels (graphons), we define the cut metric

δ(W,W)=infϕ||WϕW||

Where Wϕ(u,v)=W(ϕ(u),ϕ(v)) and ϕ are measure-preserving bijections.

see kernel cut metric


#flashcards/math/dsg

Created 2025-03-24 Last Modified 2025-05-30