[[lecture-data]] Summary
- homework 2 download ✅ 2025-03-24 📅 2025-03-24 class_notes
- pick final project class_notes 📅 2025-03-26 ✅ 2025-03-26
Homework 2 notes
- val acc not stable for the expressivity experiment: normal, since the val set is small. Loss is smooth
- typical for graph-level problems
- stability: perturbations of 10-30%; the goal is to see how stability degrades for different models
- GNN degrades slower than the filters alone
- more layers = more stable for the GNN (more nonlinearities)
- discuss degradation
- Last prob
- penalized vs not penalized
- plot model 2' vs model 2
Announcement on final project
- 1-2 ppl
- either train a model on novel graph data: 10 min presentation
- or present a research paper on any modern GNN / graph DS topic (past 5 years): 20 min presentation
- review/discussion
Limits of Graphs
Last time, we looked at different graph perturbations and how to design stable GNNs. We saw how GNNs inherit stability from their layers and that GNNs perform better than their constituent filters because of the scattering of spectral information by the nonlinear activation function.
Question
- What if the graph grows?
- What if the graph is too large and I don't have enough resources to train on it? (recall the GNN forward pass costs $O(LK \lvert E \rvert)$, where $L$ is the number of layers and $K$ is the order of the filter polynomial)
To address problem 2 (what if the graph is too large?), we can subsample the graph with a random graph model. We will talk about this later.
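For concreteness, here is a sketch (my own illustration, not from the lecture) of why the forward pass scales with the number of layers and the filter order: each polynomial graph filter $\sum_k h_k S^k x$ can be computed with $K$ repeated sparse matrix-vector products, each touching every edge once.

```python
def filter_forward(adj_list, x, coeffs):
    """Polynomial graph filter y = sum_k h_k S^k x, with S the adjacency
    operator. Each of the K matvecs costs O(|E|); stacking L such layers
    gives a forward pass that scales with L, K, and |E|."""
    n = len(x)

    def matvec(v):
        # sparse S @ v: touch each edge once
        out = [0.0] * n
        for i, nbrs in enumerate(adj_list):
            for j in nbrs:
                out[i] += v[j]
        return out

    y = [coeffs[0] * xi for xi in x]  # k = 0 term
    Skx = x
    for h in coeffs[1:]:
        Skx = matvec(Skx)            # S^k x from S^{k-1} x
        y = [yi + h * si for yi, si in zip(y, Skx)]
    return y

# 3-node path graph 0-1-2, filter h(S) = I + S
adj_list = [[1], [0, 2], [1]]
print(filter_forward(adj_list, [1.0, 0.0, 0.0], [1.0, 1.0]))  # [1.0, 1.0, 0.0]
```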
Today, we turn to problem 1 (what if the graph grows?). To address this, we want to know
Question
- Can we measure how “close” two large graphs are? ie, can we define a metric for two (large) graphs?
- If we can, then we can see if a GNN is continuous. If we can prove the GNN is continuous, then we are OK (perturbations will affect the function in a predictable way). Then,
- Can we prove the GNN is continuous WRT to this metric?
What does it mean for two graphs to be “close”? We'll say this happens when they converge to a common limit.
defining a metric between graphs
What does it mean for graphs to be “close”?
- From a probability POV, it makes sense to say that graphs are “close” if {sampled subgraphs have similar distributions}.
- Equivalently, from a statistics POV, we say graphs are “close” if they {have similar subgraph counts (triangle counts, degrees, etc)}
To measure the subgraph counts, we can use homomorphism density. Recall from lecture 11 the definition of a graph homomorphism
Graph Homomorphism
Let $F = (V', E')$ and $G = (V, E)$. A graph homomorphism from $F$ to $G$ is a map $\gamma : V' \to V$ such that $$(i,j) \in E' \implies (\gamma(i), \gamma(j)) \in E$$ ie, homomorphisms are *adjacency-preserving maps* (of $F$ into $G$).
Example
There are a few homomorphisms from to :
- And trivially, every permutation of is also a homomorphism here
There are also:
- and for each of the above, each permutation of as well.
The homomorphism count in this case.
Recall also the definition for the homomorphism density:
homomorphism density
Let $F = (V', E')$ and $G = (V, E)$ be graphs. The homomorphism density from $F$ to $G$, denoted $t(F, G)$, is given as $$t(F,G) = \frac{\text{hom}(F,G)}{\lvert V \rvert^{\lvert V' \rvert }}$$ where $\text{hom}(F,G)$ is the number of homomorphisms from $F$ to $G$.
We can think of this as the probability of sampling a version of $F$ from the graph $G$.
Example
In the example above, this equates to the probability of sampling a triangle from $G$ when we sample 3 vertices.
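On small graphs, the count and density can be checked by brute force (exponential in $\lvert V' \rvert$; the function names below are mine, for illustration only):

```python
from itertools import product

def hom_count(F_nodes, F_edges, G_nodes, G_edges):
    """Count adjacency-preserving maps gamma: V(F) -> V(G) by brute force."""
    G_adj = {(u, v) for u, v in G_edges} | {(v, u) for u, v in G_edges}
    count = 0
    for image in product(G_nodes, repeat=len(F_nodes)):
        gamma = dict(zip(F_nodes, image))
        if all((gamma[i], gamma[j]) in G_adj for i, j in F_edges):
            count += 1
    return count

def hom_density(F_nodes, F_edges, G_nodes, G_edges):
    """t(F, G) = hom(F, G) / |V(G)|^|V(F)|."""
    return hom_count(F_nodes, F_edges, G_nodes, G_edges) / len(G_nodes) ** len(F_nodes)

# triangle motif F into the complete graph K4
F_nodes, F_edges = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
G_nodes = [0, 1, 2, 3]
G_edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
print(hom_count(F_nodes, F_edges, G_nodes, G_edges))    # 24
print(hom_density(F_nodes, F_edges, G_nodes, G_edges))  # 24 / 4**3 = 0.375
```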
The astute reader will see that we can use the homomorphism density to compare graphs from the statistics POV of sampling. To do this, we need some subgraphs that serve as references for the types of counts we want (eg. triangles).
motif
Let $F$ be a graph. $F$ is a motif if it is
- undirected
- unweighted
- with 1 edge per node pair
- not loopy (no self edges)
(see graph motif)
We can use this to create an idea of “closeness” for two graphs $G_1, G_2$. It makes sense that $G_1$ and $G_2$ are “close” if for all motifs $F$, we have $t(F, G_1) \approx t(F, G_2)$.
To formalize this idea into a definition, we first need to define what a convergent graph sequence is.
Convergent graph sequences and graphons
More info from Lovász, Chayes, Borgs, Vesztergombi, starting 2008 onward
To begin, we need to define the “limit” of a sequence of graphs as the number of nodes grows.
We call the limit of a sequence of graphs of increasing node count a {graphon}
graphon
(see graphon)
NOTE
A more general definition is $W : \Omega^2 \to [0,1]$, where $\Omega$ is a sample space with some probability measure $\mu$. We can always map $\Omega$ onto $[0,1]$ using a measure-preserving map (as long as the CDF of $\mu$ is strictly monotone), so we can use $\Omega = [0,1]$ WLOG.
Exercise
Show that if the CDF of $\mu$ is strictly monotone, we can map $\Omega$ to the interval $[0,1]$ with a measure-preserving map.
Note
The codomain of $W$ may be some interval $[a, b]$. However, we usually take the codomain to be $[0,1]$ since $W(x,y)$ often represents a probability.
Example
From left to right:
- block model with same community sizes
- block model with different community sizes
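Such block-model graphons are just step functions on the unit square; a minimal sketch (`sbm_graphon` and its parameters are my naming, with `split` controlling the first community's relative size):

```python
def sbm_graphon(p_in, p_out, split=0.5):
    """Two-community stochastic block model graphon:
    W(x, y) = p_in if x and y fall in the same community, else p_out."""
    def W(x, y):
        same_community = (x < split) == (y < split)
        return p_in if same_community else p_out
    return W

W = sbm_graphon(0.9, 0.1, split=0.3)  # unequal community sizes
print(W(0.1, 0.2))  # same community -> 0.9
print(W(0.1, 0.8))  # across communities -> 0.1
```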
If the limit of the graph sequence is a graphon, then the limit of the graph homomorphism density sequence must be the {==graphon homomorphism density.==}
graphon homomorphism density
The density from a graph motif $F = (V', E')$ to a graphon $W$ is $$t(F, W) = \int_{[0,1]^{\lvert V' \rvert}} \prod_{(i,j) \in E'} W(x_i, x_j) \prod_{i \in V'} dx_i$$ This is the generalization of the definition of the homomorphism density
If $W(x, y) \in [0,1]$ for all $x, y$, then $t(F, W)$ can be interpreted as {==the probability of sampling the motif $F$ from the graphon $W$.==}
Example
Erdős–Rényi with edge probability $p$
(see graphon homomorphism density)
$\{ G(n, p) \}_n$ is a convergent graph sequence with limit $W \equiv p$.
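One way to see this convergence numerically (a sketch under my own naming, not the lecture's code) is to sample a $W$-random graph: draw $x_i \sim U[0,1]$ and connect $i, j$ with probability $W(x_i, x_j)$. For the constant graphon $W \equiv p$, the triangle homomorphism density then approaches $t(K_3, W) = p^3$ as $n$ grows.

```python
import random

def sample_graphon_graph(W, n, seed=0):
    """n-node W-random graph: x_i ~ U[0,1], edge (i, j) w.p. W(x_i, x_j)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < W(xs[i], xs[j]):
                adj[i][j] = adj[j][i] = 1
    return adj

def triangle_density(adj):
    """t(K3, G) = hom(K3, G) / n^3, counting ordered vertex triples."""
    n = len(adj)
    hom = sum(adj[a][b] * adj[b][c] * adj[a][c]
              for a in range(n) for b in range(n) for c in range(n))
    return hom / n ** 3

p = 0.5
adj = sample_graphon_graph(lambda x, y: p, n=120, seed=1)
print(triangle_density(adj))  # close to p**3 = 0.125
```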
Sequence of Convergent Graphs
(see convergent graph sequence)
- This is more of a “local” idea of convergence, since it checks whether sampled subgraphs converge to the limiting object. This is called left convergence since it deals with left homomorphism densities $t(F, G_n)$ and $t(F, W)$.
- This is not the only way to identify a (dense) convergent graph sequence. Another definition is based on the convergence of min-cuts, or right homomorphism densities $t(G_n, H)$ and $t(W, H)$.
- This is a more “global” notion of convergence that is sometimes used by graph theorists or in physics (micro-canonical ground state energy)
- For dense graphs, left and right convergence are equivalent (for the metric we like); stated without proof.
- Dense graphs are “convex”
- add definition for motif clean ✅ 2025-03-24 📅 2025-03-24 ✅ 2025-03-24
metric on graphs
While $t(F, \cdot)$ gives us a way to define convergent sequences of graphs, we can't use it to measure a distance between graphs without computing $t(F, G_1)$ and $t(F, G_2)$ for all motifs $F$.
We want to use a metric that is easier to calculate. Eventually, we want a distance for arbitrary graphs $G_1$ and $G_2$ which may have different numbers of nodes. However, we begin by building our definition from the simplest case where the node counts and labels are the same.
same node count and labels
Let $G_1$ and $G_2$ be graphs with the same number of nodes $n$ and the same node sets (ie, same node labelling).
$\ell_1$ norm for graphs
$\ell_1$ norm or “edit distance”: $$d_1(G_1, G_2) = \lVert A_1 - A_2 \rVert_1 = \sum_{i,j} \lvert (A_1)_{ij} - (A_2)_{ij} \rvert$$ where $A_1, A_2$ are the unweighted adjacency matrices of $G_1, G_2$ respectively.
Cut Norm
Let $A \in \mathbb{R}^{n \times n}$ be a matrix. The cut norm is given as $$\lVert A \rVert_\square = \max_{S, T \subseteq [n]} \left\lvert \sum_{i \in S,\, j \in T} A_{ij} \right\rvert$$
see cut norm
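Computing the cut norm exactly is hard in general, but for tiny matrices it can be brute-forced straight from the definition (an illustration of mine, exponential in $n$):

```python
from itertools import combinations

def cut_norm(A):
    """max over index subsets S, T of |sum_{i in S, j in T} A_ij|."""
    n = len(A)

    def subsets(idx):
        for r in range(len(idx) + 1):
            yield from combinations(idx, r)

    return max(abs(sum(A[i][j] for i in S for j in T))
               for S in subsets(range(n)) for T in subsets(range(n)))

# entries of mixed sign can cancel: the cut norm measures the best
# "aligned" block of entries, not the total mass
print(cut_norm([[0, 1], [-1, 0]]))  # 1
print(cut_norm([[1, 1], [1, 1]]))   # 4
```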
cut distance (same node count, same labels)
$$d_\square(G_1, G_2) = \lVert A_1 - A_2 \rVert_\square$$ The cut distance will be our metric of choice in most cases.
same node count, different labels
cut distance (same node count, potentially different labels)
Let $G_1$ and $G_2$ have the same number of nodes $n$. The cut distance is given by $$\hat{d}_\square(G_1, G_2) = \min_{\Pi} \lVert A_1 - \Pi A_2 \Pi^\top \rVert_\square$$ ie, we look at all possible permutations $\Pi$ of the nodes of $G_2$ WRT the node labelings of $G_1$.
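A brute-force sketch of this minimization (my own illustration; factorial in $n$, so toy sizes only):

```python
from itertools import combinations, permutations

def cut_norm(A):
    """Brute-force cut norm: max over index subsets S, T of |sum A_ij|."""
    n = len(A)

    def subsets(idx):
        for r in range(len(idx) + 1):
            yield from combinations(idx, r)

    return max(abs(sum(A[i][j] for i in S for j in T))
               for S in subsets(range(n)) for T in subsets(range(n)))

def cut_distance(A1, A2):
    """min over node relabelings of the cut norm of A1 - P A2 P^T."""
    n = len(A1)
    best = float("inf")
    for p in permutations(range(n)):
        D = [[A1[i][j] - A2[p[i]][p[j]] for j in range(n)] for i in range(n)]
        best = min(best, cut_norm(D))
    return best

# two copies of the 3-node path graph with different node labels
A1 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # path 0-1-2
A2 = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]   # path 0-2-1
print(cut_distance(A1, A2))  # 0 -- same graph up to relabeling
```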
different node counts
Let $G_1$ and $G_2$ be graphs with $n_1$ and $n_2$ nodes respectively, where $n_1 \neq n_2$. Since these are not comparable on their own, we need to bring them into a common frame of reference in order to calculate distances between them.
To do this, we can bring them into graphon space. So we need to define graphons induced by these graphs.
induced graphon
Let $G$ be a graph with $n$ nodes and adjacency matrix $A$. The graphon induced by $G$ is $$W_G(x, y) = \sum_{k=1}^{n} \sum_{l=1}^{n} A_{kl}\, \mathbb{1}(x \in I_k)\, \mathbb{1}(y \in I_l)$$ where $\mathbb{1}$ is the indicator function and $I_k$ is defined as $$I_k = \begin{cases} \left[ \frac{k-1}{n}, \frac{k}{n} \right) & 1 \leq k \leq n-1 \\ \left[ \frac{n-1}{n}, 1 \right] & k = n \end{cases}$$
see induced graphon
Example
This “chops up” the axes into intervals of size $1/n$ and assigns values based on $A$. We can think of this as a “heat map” of $A$.
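The induced graphon is easy to evaluate directly (a sketch; `induced_graphon` is my name for it):

```python
def induced_graphon(A):
    """Step-function graphon W_G: the unit square is split into n x n
    blocks I_k x I_l, and W_G takes the value A[k][l] on each block."""
    n = len(A)

    def W(x, y):
        # floor(n*x) picks the interval I_k; clamp so x = 1 lands in I_n
        k = min(int(n * x), n - 1)
        l = min(int(n * y), n - 1)
        return A[k][l]

    return W

# path graph 0-1-2
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
W = induced_graphon(A)
print(W(0.1, 0.5))  # block (I_1, I_2) -> A[0][1] = 1
print(W(0.1, 0.9))  # block (I_1, I_3) -> A[0][2] = 0
```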
We can represent both $G_1$ and $G_2$ as kernels on $[0,1]^2$ (their induced graphons) so that they are compatible objects. This means we need a notion of the cut norm for kernels.
cut norm (kernels)
Let $W$ be a kernel on $[0,1]^2$. Its cut norm is defined as $$\lVert W \rVert_\square = \sup_{S, T \subseteq [0,1]} \left\lvert \int_{S \times T} W(x, y)\, dx\, dy \right\rvert$$ where $S, T$ are measurable subsets.
see kernel cut norm
cut metric (kernels)
For two kernels (graphons) $W_1, W_2$, we define the cut metric $$\delta_\square(W_1, W_2) = \min_{\varphi_1, \varphi_2} \lVert W_1^{\varphi_1} - W_2^{\varphi_2} \rVert_\square, \quad W^{\varphi}(x, y) = W(\varphi(x), \varphi(y))$$ where $\varphi_1$ and $\varphi_2$ are measure-preserving bijections on $[0,1]$.