graph SAGE

[[concept]]

Graph SAGE

Graph SAGE was introduced by Hamilton, Ying, and Leskovec in 2017. Each Graph SAGE layer implements the following two operations:

Aggregate

$$(U_{ℓ})_{i} = \text{AGGREGATE}_{ℓ} \left( \{ (x_{ℓ - 1})_{j},\ j \in N (i) \} \right)$$

Concatenate

$$(x_{ℓ})_{i} = σ \left( \text{CONCAT} \left( (x_{ℓ - 1})_{i}, (U_{ℓ})_{i} \right) H \right)$$

The standard $AGGREGATE$ operation is an average over $N (i)$. Letting $H = [H_{0}^{T} \; H_{1}^{T}]^{T}$ be the weight matrix (that is, $H_{0}$ stacked on top of $H_{1}$), we get

$$
\begin{aligned}
(U_{ℓ})_{i} & = \frac{1}{| N (i) |} \sum_{j \in N (i)} (x_{ℓ - 1})_{j} \\
\implies U_{ℓ} & = S X_{ℓ - 1}, \quad S = D^{- 1} A \\
X_{ℓ} = σ ([X_{ℓ - 1} \;\; U_{ℓ}] H) & = σ (X_{ℓ - 1} H_{0} + S X_{ℓ - 1} H_{1})
\end{aligned}
$$

where $A$ is the binary adjacency matrix and $D$ is the diagonal degree matrix. With mean aggregation, the layer is therefore equivalent to a GCN with the nodewise normalization $\frac{(x_{ℓ})_{i}}{\| (x_{ℓ})_{i} \|_{2}}$. Empirically, this normalization helps in some cases.
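The derivation above can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes ReLU for $σ$, mean aggregation written as the row-normalized adjacency $D^{-1} A$, and a small made-up graph with random weights. The function name `sage_mean_layer` and all variable names are illustrative.

```python
import numpy as np

def sage_mean_layer(A, X, H0, H1, normalize=True):
    """One mean-aggregation SAGE layer: sigma(X H0 + S X H1).

    Assumptions (not from the note): sigma is ReLU, and S = D^{-1} A,
    so row i of S @ X is the average of the features over N(i).
    """
    deg = A.sum(axis=1, keepdims=True)        # |N(i)| for every node
    S = A / np.maximum(deg, 1)                # isolated nodes keep a zero row
    U = S @ X                                 # aggregate step
    Z = np.maximum(X @ H0 + U @ H1, 0.0)      # concatenate + weights + ReLU
    if normalize:
        # nodewise normalization x_i / ||x_i||_2 (all-zero rows stay zero)
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        Z = Z / np.maximum(norms, 1e-12)
    return Z, U

# Toy 4-node undirected graph with binary adjacency (made up for illustration).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))    # 4 nodes, 3 input features
H0 = rng.normal(size=(3, 2))   # weights applied to the node's own features
H1 = rng.normal(size=(3, 2))   # weights applied to the aggregated features

Z, U = sage_mean_layer(A, X, H0, H1, normalize=False)

# The concatenated form sigma([X U] H), with H0 stacked on top of H1,
# gives the same result as the summed form.
H = np.vstack([H0, H1])
Z_concat = np.maximum(np.hstack([X, U]) @ H, 0.0)
assert np.allclose(Z, Z_concat)
```

Stacking `H0` on top of `H1` is what makes the concatenated product split into the two terms of the last line of the derivation.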

notes

- SAGE implementations allow for a variety of $AGGREGATE$ functions, including $\max$ and LSTMs (which treat $N (i)$ as a sequence and therefore lose permutation equivariance). In these cases, SAGE is no longer a graph convolution.
  - Both options have advantages and disadvantages: staying a GCN versus using other aggregation functions.
- The authors of SAGE popularized the $AGGREGATE \to UPDATE$ representation of GNNs (with $K = 2$), meaning

$$X_{ℓ} = σ (S X_{ℓ - 1} H_{ℓ})$$

$S X_{ℓ - 1}$ is the "aggregate" and $σ (\dots H_{ℓ})$ is the "update"
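The aggregate/update split, and the effect of swapping the aggregation function, can be sketched as follows. This is an illustrative NumPy sketch under several assumptions: ReLU stands in for $σ$, every node has at least one neighbor, and `recurrent_agg` is a hypothetical toy recurrence used in place of an LSTM only to show order sensitivity. The graph, weights, and function names are made up.

```python
import numpy as np

def mean_agg(neigh_feats):
    """Average of the neighbor features (order-independent, linear)."""
    return neigh_feats.mean(axis=0)

def max_agg(neigh_feats):
    """Elementwise max (order-independent, but not linear)."""
    return neigh_feats.max(axis=0)

def recurrent_agg(neigh_feats):
    """Toy order-sensitive aggregator, a hypothetical stand-in for an LSTM."""
    state = np.zeros(neigh_feats.shape[1])
    for x in neigh_feats:                 # reads N(i) as a sequence
        state = np.tanh(0.5 * state + x)
    return state

def aggregate(neighbors, X, agg):
    """Row i is agg applied to the features of the nodes in N(i)."""
    return np.stack([agg(X[nbrs]) for nbrs in neighbors])

def update(U, H):
    """sigma(U H), with ReLU assumed for sigma."""
    return np.maximum(U @ H, 0.0)

# Toy graph as neighbor lists (made up for illustration).
neighbors = [[1, 2], [0, 2], [0, 1, 3], [2]]
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 3))
H = rng.normal(size=(3, 2))

# With mean aggregation, "aggregate" is exactly S X with S = D^{-1} A.
A = np.zeros((4, 4))
for i, nbrs in enumerate(neighbors):
    A[i, nbrs] = 1.0
S = A / A.sum(axis=1, keepdims=True)
U_mean = aggregate(neighbors, X, mean_agg)
assert np.allclose(U_mean, S @ X)
X_next = update(U_mean, H)

# Reversing the neighbor order leaves mean and max unchanged ...
reversed_neighbors = [nbrs[::-1] for nbrs in neighbors]
for agg in (mean_agg, max_agg):
    assert np.allclose(aggregate(neighbors, X, agg),
                       aggregate(reversed_neighbors, X, agg))

# ... but changes the sequence-based aggregator.
order_matters = not np.allclose(
    aggregate(neighbors, X, recurrent_agg),
    aggregate(reversed_neighbors, X, recurrent_agg),
)
```

Only the mean aggregator can be written as a matrix product $S X_{ℓ - 1}$, which is why the other choices fall outside the graph-convolution form.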

Mentions

File
the WL test is at least as powerful as a GNN for detecting graph non-isomorphism
2025-02-17 graphs lecture 8
2025-02-24 graphs lecture 10
2025-04-16 lecture 21
2025-02-25 equivariant lecture 4