Data

subject:: Data Science Methods for Large Scale Graphs

2025-01-29 see Lecture 3.pdf

Today

Last Time

Recall that if we diagonalizable shift operator $S \in R^{n \times n}$ that we can write as $S = V Λ V^{*}$ and graph signal $x$ , then the graph fourier transform is given as $GFT (x) = \overset{x}{^} = V^{*} x$

Today

Graph convolutional filters, which have nice properties

Locality

Shift invariance

Permutation Invariance

Frequency domain representations

expressivity

1. Graph Signals and Graph Signal Processing

1.1 graph convolution

Graph Filter

A (linear) graph filter is defined as an operator $H : R^{n} \to R^{n}, x \to y = H S, H \in R^{n \times n}$ ie, a graph filter is a linear map from graph signals to graph signals.

This is a basic building block for processing graph signals

(see linear graph filter)

Question

What are potential issues with such filters?

Answer

Since $H$ is not parameterized by the graph or the shift operator, it does not incorporate the graph sparsity pattern.

if nodes $i, j$ are in different connected components but $H_{ij} \neq = 0$ , this is not consistent with the structure of the graph!

Because of this, we cannot implement $H$ locally at each node.

$(S x)_{i} = \sum_{j} S_{ij} x_{i} = \sum_{j \in N_{i}} S_{ij} x_{i}$ , but we cannot write $H$ analogously ie $(H x)_{i} \neq = \sum_{j \in N_{i}} h_{j} x_{j}$

The number of parameters is $n^{2}$ - this is bad for large $G$ .

We cannot adapt $H$ for different sized graphs. Once we design $H$ , it only works for graphs of size $n$ .

Solution: Linear shift-invariant / graph convolution

Linear Shift-Invariant /Convolutional Graph Filters

We define our convolutional graph filter (or linear shift-invariant) as follows: $y = H (S) x = \sum_{k = 0}^{K - 1} h_{k} S^{k} x, h_{0}, h_{1}, \dots, h_{K - 1} \in R$ Note that means $H (S)$ is a $(K - 1)$ th degree polynomial of $S$ with (filter) coefficients $h_{i}$

Properties

This filter is

local

shift equivariant

permutation invariant

(see graph convolution)

We show each of the properties above:

Theorem

graph convolution are local.

$k$ steps:

We can write locally in terms of

y &= \sum_{k=0}^{K-1}h_{k}S^k x \ &= \sum_{k=0}^{K-1}h_{k}S^{k-1}Sx, ;; \text{let } z_{1}= Sx\ &= \sum_{k=0}^{K-1}h_{k}S^{k-1}z_{1} + h_{0}x \ &= \sum_{k=1}^{K-1}h_{k} S^{k-2}Sz_{1} + h_{0}x, ;; \text{let } z_{2}=Sz_{1}\ &= \sum_{k=2}^{K-1}h_{k}S^{k-2}Sz_{2} + h_{0}x + h_{1}z_{1} \ &= \dots\ &= \sum_{k=0}^{K-1}h_{k}z_{k} ,, (\text{letting }z_{i}=S^ix) \end{aligned}$$

Visual representation of the stepwise local expression

(see convolutional graph filters are local)

Theorem

graph convolution are shift equivariant.

Let $y = H (S) x$ where $S$ is a shift operator. Suppose $x^{'} = S x$ . To show shift equivariance, we need to show $y^{'} = S y$ (ie the output is shifted in the same way the input was).

Proof

y’ = H(S)x’ &= \sum_{k=0}^{K-1}h_{k}S^{k} (Sx) \ &= \sum_{k=0}^{K-1}h_{k}S^{k+1}x \ &= S \sum_{k=0}^{K-1}h_{k}S^k x \ &= Sy \end{aligned}$$

(see convolutional graph filters are shift equivariant)

note: if $x$ is shifted/diffused, $y$ is shifted in the same way!

Theorem

graph convolution are permutation equivariant.

Suppose we have a graph shift operator $S$ , a graph signal $x$ , and a permutation matrix $P$ . (recall that $P \in {0, 1}$ such that $P 1 = 1$ , $P 1 = 1$ , $P^{T} P = P P^{T} = I$ ). Suppose we relabel nodes $V = {1, 2, \dots, n}$ according to $P$ , ie, define $S^{'} = PS P^{T}$ and $x^{'} = P x$ . Consider filter $y = H (S) x$ .

Then $y^{'} = H (S) x^{'} = P y$

Proof

y’ = H(S’)x’ &= \sum_{k=0}^{K-1}h_{k}(PSP^T)^k Px \ &= \sum_{k=0}^{K-1}h_{k} PS^k P^T Px \ &= P \sum_{k=0}^{K-1}h_{k}S^{k} x \ &= Py \end{aligned}$$

Thus $H$ is invariant to permutations. (if $S$ and $x$ are permuted/relabelled, $y$ is relabelled in the same way)

(see convolutional graph filters are permutation equivariant)

1.2 Spectral Representations of Convolutional Graph Filters

This is the fourier transform of a filter.

Given $S = V Λ V^{*}$ , recall that $\overset{x}{^} = GFT (x) = V^{*} x$ is the graph fourier transform and $x = I GFT (x) = V \overset{x}{^}$ (see inverse graph fourier transform)

The graph fourier transform of the filtered signal is given as

\hat{y} = V^* y &= V^* \sum_{k=0}^{K-1}h_{k}S^{k} x \\ &= V^* \sum_{k=0}^{K-1}h_{k}S^{k} (V \hat{x}) \\ &= \sum_{k=0}^{K-1}h_{k} V^* (V \Lambda V^*)^ V \hat{x} \\ &= \sum_{k=0}^{K-1}h_{k} \Lambda^k \hat{x} \end{aligned}$$ Thus, the [[Concept Wiki/graph fourier transform\|GFT]] of the filter and the operator has the same polynomial relationship, but in the spectral domain. see [[Concept Wiki/spectral representation of a convolutional graph filter]] >[!tip] Key Takeaway > in the spectral/frequency domain, we have $$\hat{y} = \sum_{k=0}^{K-1}h_{k} \Lambda^k \hat{x}$$ The spectral representation/frequency response of $H(S)$ is $$\widehat{H(S)} = \sum_{k=0}^{K-1}h_{k} \Lambda^k$$ >[!note] Observations > >1. *the filter is independent from the graph*. It is completely defined by the polynomial $h(\lambda) = \sum_{k=0}^{K-1}h_{k} \lambda^k$ in the spectral domain > - We can see from above that the filter frequency response only depends on the coefficients for the filter $h_{k}$ and the [[Concept Wiki/eigenvalue\|eigenvalues]] $\Lambda$ of the [[Concept Wiki/graph shift operator\|shift operator]]. > - If we consider another graph $\tilde{G}$ with [[Concept Wiki/graph shift operator\|shift operator]] $\tilde{S}$ and $\sigma(\tilde{S}) = \tilde{\Lambda}$. Then $H(\tilde{S}) = \sum_{k=0}^{K-1}h_{k} \tilde{\Lambda}^k$ > 2. the $i$th spectral component of output $y$ *only depends on $\lambda_{i}$ and $\hat{x}_{i}$* > - $\widehat{H(S)}$ is *pointwise* in the spectral domain. ie $\hat{y}_{i} = \sum_{k=0}^{K-1}h_{k}\lambda_{i}^k \hat{x}_{i}$. This follows immediately from the fact that $\Lambda$ is diagonal. > > >[!answer] Idea > >The polynomial is fixed once we find the coefficients $h_k$. We can think of $\widehat{H(S)}$ as sampling values $\lambda_{i}$ along the curve $h(\lambda) = \sum_{k=0}^{K-1}h_{k}\lambda^k$. Thus changing $S$ corresponds to a different set of points sampled from this polynomial. > > ![[Attachments/2025-01-29_graph1.jpg]] > > > > (see [[Concept Wiki/the spectral representation of a graph filter is independent from the graph]], [[Concept Wiki/the spectral graph filter operates on a signal pointwise]]) ### 1.2.1 Expressivity of convolutional graph filters >[!question] > >What kinds of functions can $H(S)$ represent (in the spectral domain)? > Suppose we want to design a filter with spectral response $f: \mathbb{R} \to \mathbb{R}, \lambda \to f(\lambda)$. Can this be implemented as a [[Concept Wiki/graph convolution\|graph filter]]? > >>[!answer] >>Yes, as long as $f$ is analytic - ie, a function with a convergent Taylor series. see [[Concept Wiki/we can represent any analytic function with convolutional graph filters]] >[!theorem] Fact > >Suppose $f$ is an [[Concept Wiki/analytic function]]. Then there exist coefficients $h_{i}$ such that >$$\begin{aligned}h(\lambda) &= \sum_{k=0}^{\infty}h_{k}\Lambda^k \\ >&= f(0) + \lambda f'(0) + \frac{\lambda^2}{2!}f''(0) + \frac{\lambda^3}{3!}f'''(0) + \dots \\ >&= f \end{aligned}$$ > >>[!proof] >>This follows immediately from the definition of an analytic function >[!example] > >Suppose we have a function $f(\lambda) = \exp \left\{ -\frac{(\lambda-a)^2}{b} \right\}$. Then $h(\lambda) = ?$ >The Taylor series for $h$ is given as > $$h(\lambda) = f(0) + \lambda f'(0) + \frac{\lambda^2}{2!}f''(0) + \frac{\lambda^3}{3!}f'''(0) + \dots$$ >Here, >- $f' = -\frac{2(\lambda-a)}{b} \exp\{- \frac{(\lambda-a)^2}{b}\} \implies f'(0) =$ >- $f'' = () \implies f''(0)=$ >- $f''' = \implies f'''(0)=$ > >Thus we can define $H(S) =\sum_{k=0}^{K-1}h_{k}\Lambda^k$ with our corresponding coefficients. >[!tip] Important > We can (and in practice, will) truncate the degree $K$. This corresponds to the order $K$ Taylor approximation of $f(\lambda)$. >[!question] > >What are the limitations of the above expressivity result? >>[!answer] >>- it is spectral. It does not apply to the graph domain. >>- This does not account for the specific [[Concept Wiki/graph]] / [[Concept Wiki/graph shift operator]] $S(G)$ or the specific [[Concept Wiki/graph signals\|graph signal]] $x$ In general, we want to answer >[!question] > > Can we use $H(S)$ to represent (or approximate) *any* signal/representation $y$ ? >[!theorem] > >Let $S$ be a [[Concept Wiki/graph shift operator\|shift operator]] with [[Concept Wiki/eigenvalue\|eigenvalues]] $\lambda_{1}, \lambda_{2}, \dots, \lambda_{n}$, and $x$ a [[Concept Wiki/graph signals\|graph signal]]. Let $\hat{x}$ be the [[Concept Wiki/graph fourier transform\|GFT]]. > >Suppose >- $\hat{x}_{i} \neq 0\;\;\;\forall i$ >- $\lambda_{i} \neq \lambda_{j}\;\;\forall i \neq j$ > > Then there exist $K \leq n$ coefficients $h_{0},\dots,h_{K-1}$ such that $y=H(S)x$. > >>[!proof]+ >>Recall from above that >>- $\hat{y} = h(\Lambda)\hat{x}$ >>- $\hat{y} = \sum_{k}^{K-1} h_{k}\Lambda^k \hat{x}$ >>- $\hat{y}_{i} = \sum_{k}^{K-1} h_{k} \lambda^k_{i} \hat{x}_{i}$ >> >> We can write this as a linear system: >> $$\begin{aligned} \begin{bmatrix} \hat{x}_{1} & \hat{x}_{1} \lambda_{1} & \hat{x}_{1}\lambda_{1}^2 & \dots & \hat{x}_{1}\lambda_{1}^{K-1} \\ \hat{x}_{2} & \hat{x}_{2}\lambda_{2} & \hat{x}_{2} \lambda_{2}^2 & \dots & \hat{x}_{2}\lambda_{2}^{K-1} \\ \vdots & \vdots & & & \vdots \\ \hat{x}_{n} & \hat{x}_{n}\lambda_{n} & \hat{x}_{n}\lambda_{n}^2 & \dots & \hat{x}_{n}\lambda_{n}^{K-1} \\ \end{bmatrix}\begin{bmatrix} h_{0} \\ h_{1} \\ \vdots \\ h_{K-1} \end{bmatrix}&= \\ \text{diag}(\hat{x}) \begin{bmatrix} 1 & \lambda_{1} & \lambda_{1}^2 & \dots & \lambda_{1}^{K-1} \\ 1 & \lambda_{2} & \lambda_{2}^2 & \dots & \lambda_{2}^{K-1} \\ \vdots & \vdots & & & \vdots \\ 1 & \lambda_{n} & \lambda_{n}^2 & \dots & \lambda_{n}^{K-1} \end{bmatrix}\begin{bmatrix} h_{0} \\ h_{1} \\ \vdots \\ h_{K-1} \end{bmatrix} &= \begin{bmatrix} \hat{y}_{1} \\ \hat{y}_{2} \\ \vdots \\ \hat{y}_{n} \end{bmatrix} \end{aligned}$$ >> Note that LHS $= \text{diag}(\hat{x})V$ where $V$ is a $n\times K$ Vandermonde matrix >> >> In order to find a solution to the system (ie, in order for us to find coefficients $h_{0}, \dots, h_{K-1}$), we need >> 1. $\text{diag}(\hat{x})$ invertible $\iff \hat{x}_{i} \neq 0\;\;\;\forall i$ >> 2. $Vh = \text{diag}(\hat{x})^{-1}\hat{y}$ to yield a solution >> >> In the simplest case where $K=n$, we need the Vandermonde matrix to have an inverse. Since $\det(V) = \prod_{i \neq j} (\lambda_{i} - \lambda_{j})$, $V$ is invertible if and only the $\lambda_{i}$ are distinct. >> >> If $K \neq n$ or arbitrary $K$, the Rouché-Capelli Theorem states that a linear system with $n$ equations in $K$ unknowns is consistent (ie has a solution) if and only if the rank of the coefficient matrix and the augmented matrix are the same. In this case, this means >> $$\begin{aligned} \text{rank}\left( >>\begin{bmatrix} 1 & \lambda_{1} & \lambda_{1}^2 & \dots & \lambda_{1}^{K-1} \\ 1 & \lambda_{2} & \lambda_{2}^2 & \dots & \lambda_{2}^{K-1} \\ \vdots & \vdots & & & \vdots \\ 1 & \lambda_{n} & \lambda_{n}^2 & \dots & \lambda_{n}^{K-1} \end{bmatrix}\right) &= \text{rank} \left( \left[\begin{array}{ccccc|c} 1 & \lambda_{1} & \lambda_{1}^2 & \dots & \lambda_{1}^{K-1} & \frac{\hat{y}_{1}}{\hat{x}_{1}} \\ 1 & \lambda_{2} & \lambda_{2}^2 & \dots & \lambda_{2}^{K-1} & \frac{\hat{y}_{2}}{\hat{x}_{2}}\\ \vdots & \vdots & & & \vdots & \vdots\\ 1 & \lambda_{n} & \lambda_{n}^2 & \dots & \lambda_{n}^{K-1} & \frac{\hat{y}_{n}}{\hat{x}_{n}} \end{array} \right]\right) \\ >> \impliedby \lambda_{i} \neq \lambda_{j} \;\;\; \forall i \neq j \end{aligned}$$ > >(see [[Concept Wiki/conditions for finding a convolutional graph filter]])

mnzk digital garden

Explorer

2025-01-29 graphs lecture 3

1. Graph Signals and Graph Signal Processing

1.1 graph convolution

1.2 Spectral Representations of Convolutional Graph Filters

Graph View

Table of Contents

Backlinks