Data
2025-01-29 see Lecture 3.pdf
Today
- Recall that if we diagonalizable shift operator that we can write as and graph signal , then the graph fourier transform is given as
Today
- Graph convolutional filters, which have nice properties
- Locality
- Shift invariance
- Permutation Invariance
- Frequency domain representations
- expressivity
1. Graph Signals and Graph Signal Processing
1.1 graph convolution
Graph Filter
A (linear) graph filter is defined as an operator ie, a graph filter is a linear map from graph signals to graph signals.
This is a basic building block for processing graph signals
(see linear graph filter)
Question
What are potential issues with such filters?
Answer
- Since is not parameterized by the graph or the shift operator, it does not incorporate the graph sparsity pattern.
- if nodes are in different connected components but , this is not consistent with the structure of the graph!
- Because of this, we cannot implement locally at each node.
- , but we cannot write analogously ie
- The number of parameters is - this is bad for large .
- We cannot adapt for different sized graphs. Once we design , it only works for graphs of size .
Solution: Linear shift-invariant / graph convolution
Linear Shift-Invariant /Convolutional Graph Filters
We define our convolutional graph filter (or linear shift-invariant) as follows: Note that means is a th degree polynomial of with (filter) coefficients
Properties
This filter is
- local
- shift equivariant
- permutation invariant
(see graph convolution)
We show each of the properties above:
Theorem
graph convolution are local.
steps:
We can write locally in terms of
y &= \sum_{k=0}^{K-1}h_{k}S^k x \ &= \sum_{k=0}^{K-1}h_{k}S^{k-1}Sx, ;; \text{let } z_{1}= Sx\ &= \sum_{k=0}^{K-1}h_{k}S^{k-1}z_{1} + h_{0}x \ &= \sum_{k=1}^{K-1}h_{k} S^{k-2}Sz_{1} + h_{0}x, ;; \text{let } z_{2}=Sz_{1}\ &= \sum_{k=2}^{K-1}h_{k}S^{k-2}Sz_{2} + h_{0}x + h_{1}z_{1} \ &= \dots\ &= \sum_{k=0}^{K-1}h_{k}z_{k} ,, (\text{letting }z_{i}=S^ix) \end{aligned}$$
Visual representation of the stepwise local expression
Theorem
graph convolution are shift equivariant.
Let where is a shift operator. Suppose . To show shift equivariance, we need to show (ie the output is shifted in the same way the input was).
Proof
y’ = H(S)x’ &= \sum_{k=0}^{K-1}h_{k}S^{k} (Sx) \ &= \sum_{k=0}^{K-1}h_{k}S^{k+1}x \ &= S \sum_{k=0}^{K-1}h_{k}S^k x \ &= Sy \end{aligned}$$
note: if is shifted/diffused, is shifted in the same way!
Theorem
graph convolution are permutation equivariant.
Suppose we have a graph shift operator , a graph signal , and a permutation matrix . (recall that such that , , ). Suppose we relabel nodes according to , ie, define and . Consider filter .
Then
Proof
y’ = H(S’)x’ &= \sum_{k=0}^{K-1}h_{k}(PSP^T)^k Px \ &= \sum_{k=0}^{K-1}h_{k} PS^k P^T Px \ &= P \sum_{k=0}^{K-1}h_{k}S^{k} x \ &= Py \end{aligned}$$
Thus is invariant to permutations. (if and are permuted/relabelled, is relabelled in the same way)
(see convolutional graph filters are permutation equivariant)
1.2 Spectral Representations of Convolutional Graph Filters
This is the fourier transform of a filter.
Given , recall that is the graph fourier transform and (see inverse graph fourier transform)
The graph fourier transform of the filtered signal is given as
\hat{y} = V^* y &= V^* \sum_{k=0}^{K-1}h_{k}S^{k} x \\ &= V^* \sum_{k=0}^{K-1}h_{k}S^{k} (V \hat{x}) \\ &= \sum_{k=0}^{K-1}h_{k} V^* (V \Lambda V^*)^ V \hat{x} \\ &= \sum_{k=0}^{K-1}h_{k} \Lambda^k \hat{x} \end{aligned}$$ Thus, the [[Concept Wiki/graph fourier transform\|GFT]] of the filter and the operator has the same polynomial relationship, but in the spectral domain. see [[Concept Wiki/spectral representation of a convolutional graph filter]] >[!tip] Key Takeaway > in the spectral/frequency domain, we have $$\hat{y} = \sum_{k=0}^{K-1}h_{k} \Lambda^k \hat{x}$$ The spectral representation/frequency response of $H(S)$ is $$\widehat{H(S)} = \sum_{k=0}^{K-1}h_{k} \Lambda^k$$ >[!note] Observations > >1. *the filter is independent from the graph*. It is completely defined by the polynomial $h(\lambda) = \sum_{k=0}^{K-1}h_{k} \lambda^k$ in the spectral domain > - We can see from above that the filter frequency response only depends on the coefficients for the filter $h_{k}$ and the [[Concept Wiki/eigenvalue\|eigenvalues]] $\Lambda$ of the [[Concept Wiki/graph shift operator\|shift operator]]. > - If we consider another graph $\tilde{G}$ with [[Concept Wiki/graph shift operator\|shift operator]] $\tilde{S}$ and $\sigma(\tilde{S}) = \tilde{\Lambda}$. Then $H(\tilde{S}) = \sum_{k=0}^{K-1}h_{k} \tilde{\Lambda}^k$ > 2. the $i$th spectral component of output $y$ *only depends on $\lambda_{i}$ and $\hat{x}_{i}$* > - $\widehat{H(S)}$ is *pointwise* in the spectral domain. ie $\hat{y}_{i} = \sum_{k=0}^{K-1}h_{k}\lambda_{i}^k \hat{x}_{i}$. This follows immediately from the fact that $\Lambda$ is diagonal. > > >[!answer] Idea > >The polynomial is fixed once we find the coefficients $h_k$. We can think of $\widehat{H(S)}$ as sampling values $\lambda_{i}$ along the curve $h(\lambda) = \sum_{k=0}^{K-1}h_{k}\lambda^k$. Thus changing $S$ corresponds to a different set of points sampled from this polynomial. > > ![[Attachments/2025-01-29_graph1.jpg]] > > > > (see [[Concept Wiki/the spectral representation of a graph filter is independent from the graph]], [[Concept Wiki/the spectral graph filter operates on a signal pointwise]]) ### 1.2.1 Expressivity of convolutional graph filters >[!question] > >What kinds of functions can $H(S)$ represent (in the spectral domain)? > Suppose we want to design a filter with spectral response $f: \mathbb{R} \to \mathbb{R}, \lambda \to f(\lambda)$. Can this be implemented as a [[Concept Wiki/graph convolution\|graph filter]]? > >>[!answer] >>Yes, as long as $f$ is analytic - ie, a function with a convergent Taylor series. see [[Concept Wiki/we can represent any analytic function with convolutional graph filters]] >[!theorem] Fact > >Suppose $f$ is an [[Concept Wiki/analytic function]]. Then there exist coefficients $h_{i}$ such that >$$\begin{aligned}h(\lambda) &= \sum_{k=0}^{\infty}h_{k}\Lambda^k \\ >&= f(0) + \lambda f'(0) + \frac{\lambda^2}{2!}f''(0) + \frac{\lambda^3}{3!}f'''(0) + \dots \\ >&= f \end{aligned}$$ > >>[!proof] >>This follows immediately from the definition of an analytic function >[!example] > >Suppose we have a function $f(\lambda) = \exp \left\{ -\frac{(\lambda-a)^2}{b} \right\}$. Then $h(\lambda) = ?$ >The Taylor series for $h$ is given as > $$h(\lambda) = f(0) + \lambda f'(0) + \frac{\lambda^2}{2!}f''(0) + \frac{\lambda^3}{3!}f'''(0) + \dots$$ >Here, >- $f' = -\frac{2(\lambda-a)}{b} \exp\{- \frac{(\lambda-a)^2}{b}\} \implies f'(0) =$ >- $f'' = () \implies f''(0)=$ >- $f''' = \implies f'''(0)=$ > >Thus we can define $H(S) =\sum_{k=0}^{K-1}h_{k}\Lambda^k$ with our corresponding coefficients. >[!tip] Important > We can (and in practice, will) truncate the degree $K$. This corresponds to the order $K$ Taylor approximation of $f(\lambda)$. >[!question] > >What are the limitations of the above expressivity result? >>[!answer] >>- it is spectral. It does not apply to the graph domain. >>- This does not account for the specific [[Concept Wiki/graph]] / [[Concept Wiki/graph shift operator]] $S(G)$ or the specific [[Concept Wiki/graph signals\|graph signal]] $x$ In general, we want to answer >[!question] > > Can we use $H(S)$ to represent (or approximate) *any* signal/representation $y$ ? >[!theorem] > >Let $S$ be a [[Concept Wiki/graph shift operator\|shift operator]] with [[Concept Wiki/eigenvalue\|eigenvalues]] $\lambda_{1}, \lambda_{2}, \dots, \lambda_{n}$, and $x$ a [[Concept Wiki/graph signals\|graph signal]]. Let $\hat{x}$ be the [[Concept Wiki/graph fourier transform\|GFT]]. > >Suppose >- $\hat{x}_{i} \neq 0\;\;\;\forall i$ >- $\lambda_{i} \neq \lambda_{j}\;\;\forall i \neq j$ > > Then there exist $K \leq n$ coefficients $h_{0},\dots,h_{K-1}$ such that $y=H(S)x$. > >>[!proof]+ >>Recall from above that >>- $\hat{y} = h(\Lambda)\hat{x}$ >>- $\hat{y} = \sum_{k}^{K-1} h_{k}\Lambda^k \hat{x}$ >>- $\hat{y}_{i} = \sum_{k}^{K-1} h_{k} \lambda^k_{i} \hat{x}_{i}$ >> >> We can write this as a linear system: >> $$\begin{aligned} \begin{bmatrix} \hat{x}_{1} & \hat{x}_{1} \lambda_{1} & \hat{x}_{1}\lambda_{1}^2 & \dots & \hat{x}_{1}\lambda_{1}^{K-1} \\ \hat{x}_{2} & \hat{x}_{2}\lambda_{2} & \hat{x}_{2} \lambda_{2}^2 & \dots & \hat{x}_{2}\lambda_{2}^{K-1} \\ \vdots & \vdots & & & \vdots \\ \hat{x}_{n} & \hat{x}_{n}\lambda_{n} & \hat{x}_{n}\lambda_{n}^2 & \dots & \hat{x}_{n}\lambda_{n}^{K-1} \\ \end{bmatrix}\begin{bmatrix} h_{0} \\ h_{1} \\ \vdots \\ h_{K-1} \end{bmatrix}&= \\ \text{diag}(\hat{x}) \begin{bmatrix} 1 & \lambda_{1} & \lambda_{1}^2 & \dots & \lambda_{1}^{K-1} \\ 1 & \lambda_{2} & \lambda_{2}^2 & \dots & \lambda_{2}^{K-1} \\ \vdots & \vdots & & & \vdots \\ 1 & \lambda_{n} & \lambda_{n}^2 & \dots & \lambda_{n}^{K-1} \end{bmatrix}\begin{bmatrix} h_{0} \\ h_{1} \\ \vdots \\ h_{K-1} \end{bmatrix} &= \begin{bmatrix} \hat{y}_{1} \\ \hat{y}_{2} \\ \vdots \\ \hat{y}_{n} \end{bmatrix} \end{aligned}$$ >> Note that LHS $= \text{diag}(\hat{x})V$ where $V$ is a $n\times K$ Vandermonde matrix >> >> In order to find a solution to the system (ie, in order for us to find coefficients $h_{0}, \dots, h_{K-1}$), we need >> 1. $\text{diag}(\hat{x})$ invertible $\iff \hat{x}_{i} \neq 0\;\;\;\forall i$ >> 2. $Vh = \text{diag}(\hat{x})^{-1}\hat{y}$ to yield a solution >> >> In the simplest case where $K=n$, we need the Vandermonde matrix to have an inverse. Since $\det(V) = \prod_{i \neq j} (\lambda_{i} - \lambda_{j})$, $V$ is invertible if and only the $\lambda_{i}$ are distinct. >> >> If $K \neq n$ or arbitrary $K$, the Rouché-Capelli Theorem states that a linear system with $n$ equations in $K$ unknowns is consistent (ie has a solution) if and only if the rank of the coefficient matrix and the augmented matrix are the same. In this case, this means >> $$\begin{aligned} \text{rank}\left( >>\begin{bmatrix} 1 & \lambda_{1} & \lambda_{1}^2 & \dots & \lambda_{1}^{K-1} \\ 1 & \lambda_{2} & \lambda_{2}^2 & \dots & \lambda_{2}^{K-1} \\ \vdots & \vdots & & & \vdots \\ 1 & \lambda_{n} & \lambda_{n}^2 & \dots & \lambda_{n}^{K-1} \end{bmatrix}\right) &= \text{rank} \left( \left[\begin{array}{ccccc|c} 1 & \lambda_{1} & \lambda_{1}^2 & \dots & \lambda_{1}^{K-1} & \frac{\hat{y}_{1}}{\hat{x}_{1}} \\ 1 & \lambda_{2} & \lambda_{2}^2 & \dots & \lambda_{2}^{K-1} & \frac{\hat{y}_{2}}{\hat{x}_{2}}\\ \vdots & \vdots & & & \vdots & \vdots\\ 1 & \lambda_{n} & \lambda_{n}^2 & \dots & \lambda_{n}^{K-1} & \frac{\hat{y}_{n}}{\hat{x}_{n}} \end{array} \right]\right) \\ >> \impliedby \lambda_{i} \neq \lambda_{j} \;\;\; \forall i \neq j \end{aligned}$$ > >(see [[Concept Wiki/conditions for finding a convolutional graph filter]])
Visual representation of the stepwise local expression