Random Matrix Lecture 21

[[lecture-data]]

Info

```dataviewjs
let current = dv.current();
let previous = current.previous_lecture;

// get notes with this as the previous
let pages = dv.pages("")
	.where(p => p.previous_lecture)
	.where(p => String(p.previous_lecture)
		.includes(dv.current().file.link));

let output = `:LiArrowBigLeftDash: ${previous || "No Previous Lecture"} | `;

if (pages.length > 0) {
	output += pages
		.map(p => p.file.link).join(", ");
} else {
	output += 'No Next Lecture';
}

output += ' :LiArrowBigRightDash:';

dv.el('div', output);
```
 

Class Notes

Summary

  • Subgaussian concentration inequalities for random matrices
    • McDiarmid Inequality
    • log-Sobolev inequalities
    • Gaussian-Lipschitz concentration
  • Applications of Gaussian-Lipschitz

7. Non-Asymptotic RMT

Last time, we saw:

Corollary $X$ symmetric with independent entries in , then .

What if we don't assume

  • i.i.d. entries
  • centered entries ($\mathbb{E}[X_{ij}] = 0$)?

Example

Any adjacency matrix of an (unweighted) random graph with independent edges (Erdős-Rényi, SBM)
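An illustrative sketch (my own, not from the lecture; the graph size $n$ and density $p$ are arbitrary choices): such a matrix has independent but non-centered entries, so last time's result does not directly apply.

```python
# Minimal sketch: adjacency matrix of an Erdős-Rényi graph G(n, p).
# Off-diagonal entries are independent Bernoulli(p), hence NOT centered.
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 0.05  # illustrative values

upper = np.triu(rng.random((n, n)) < p, k=1)   # independent edges above the diagonal
A = (upper | upper.T).astype(float)            # symmetrize; zero diagonal

eigs = np.linalg.eigvalsh(A)
print("largest eigenvalue ≈", eigs[-1], "  (≈ np =", n * p, "because of the nonzero mean)")
```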

Remark

  • Bounded Differences implies
  • Efron-Stein theorem implies

Question What about, for example, when ?

Poincaré Inequality

Let $\mu$ be a probability measure on $\mathbb{R}^N$. We say that $\mu$ satisfies the Poincaré inequality with constant $C$, denoted PI($C$), if for all differentiable $f$,
$$\text{Var}_{x \sim \mu}(f(x)) \leq C\, \mathbb{E}_{x \sim \mu}\left[\lVert \nabla f(x) \rVert^2\right]$$

Notes

  • $\lVert \nabla f(x) \rVert^2$ is an expression of the function’s sensitivity to each coordinate
  • Sometimes, is the “other” notion of ""

see pointcare inequality
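As a quick sanity check (my addition, not from the lecture), linear functions show the inequality cannot hold with any constant smaller than 1 for the standard Gaussian: with $f(x) = \langle v, x \rangle$ and $x \sim {\cal N}(0, I_N)$,

$$\text{Var}(f(x)) = \lVert v \rVert^2 = \mathbb{E}\left[\lVert \nabla f(x) \rVert^2\right],$$

so PI($C$) for ${\cal N}(0, I_N)$ forces $C \geq 1$; the Gaussian Poincaré inequality below says $C = 1$ in fact suffices.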

Theorem

If $\mu$ has the Poincaré inequality PI($C$) and $f$ is $L$-Lipschitz, then for $x \sim \mu$,
$$\text{Var}(f(x)) \leq C L^2$$

Proof

Exercise
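One possible route for the exercise (a sketch, assuming $f$ is differentiable almost everywhere): Lipschitzness gives $\lVert \nabla f(x) \rVert \leq L$ wherever the gradient exists, so

$$\text{Var}_{x \sim \mu}(f(x)) \leq C\, \mathbb{E}_{x \sim \mu}\left[\lVert \nabla f(x) \rVert^2\right] \leq C L^2.$$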

Gaussian Poincaré Inequality Theorem (GPI)

${\cal N}(0, I_N)$ has the Poincaré inequality PI($C$) where $C = 1$ (independent of $N$!)

See Gaussian Poincaré Inequality Theorem
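A minimal numerical sanity check of GPI (my own sketch; the test function $f(x) = \lVert x \rVert_2$ and the dimensions are arbitrary choices, not from the lecture):

```python
# Monte Carlo check of the Gaussian Poincaré inequality: for a 1-Lipschitz f,
# Var(f(g)) <= 1 when g ~ N(0, I_N), independent of N.
import numpy as np

rng = np.random.default_rng(1)
N, n_samples = 50, 100_000

g = rng.standard_normal((n_samples, N))
f_vals = np.linalg.norm(g, axis=1)   # f(x) = ||x||_2 is 1-Lipschitz

print("Var(f(g)) ≈", f_vals.var(), "   (GPI bound: 1)")
```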

Corollary

$$\text{Var}_{X \sim \text{GOE}(d)}(\lambda_{1}(X)) \leq 2$$

Proof

We can view $X \sim \text{GOE}(d)$ as a function of a Gaussian vector $g \sim {\cal N}(0, I_{N})$ with $N = \frac{d(d+1)}{2}$, and construct $X(g)$ where

$$X(g) = \begin{bmatrix} \sqrt{ 2 }g_{1,1} & \dots & & \\ \vdots & \ddots & g_{i,j} & & \\ & g_{i,j} & \ddots & \\ & & & \sqrt{ 2 }g_{\frac{d(d+1)}{2}} \end{bmatrix}$$

Then define $f(g)=\lambda_{1}(X(g))$. This is Lipschitz with $L=\sqrt{ 2 }$, since

$$\begin{align} \lvert f(g)-f(h) \rvert &= \lvert \lambda_{1}(X(g)) - \lambda_{1}(X(h)) \rvert \\ &\leq \lVert X(g) - X(h) \rVert_{\text{op}} \\ &\leq \lVert X(g) - X(h) \rVert_{F} \\ &\leq \sqrt{ 2 }\, \lVert g-h \rVert \end{align}$$

Hence

$$\text{Var}_{X \sim \text{GOE}}(\lambda_{1}(X)) = \text{Var}_{g \sim {\cal N}(0, I_{N})}(f(g)) \overset{\text{(GPI)}}{\leq} C L^2 = 2$$

$$\tag*{$\blacksquare$}$$
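A hedged numerical check of this corollary (my own sketch; it assumes the GOE normalization used in the proof, i.e. off-diagonal entries ${\cal N}(0,1)$ and diagonal entries ${\cal N}(0,2)$):

```python
# Estimate Var(lambda_1(X)) for X ~ GOE(d); the corollary predicts a
# dimension-free bound of 2.
import numpy as np

rng = np.random.default_rng(2)
d, n_trials = 40, 2000

lam1 = np.empty(n_trials)
for t in range(n_trials):
    A = rng.standard_normal((d, d))
    X = (A + A.T) / np.sqrt(2)            # off-diagonal variance 1, diagonal variance 2
    lam1[t] = np.linalg.eigvalsh(X)[-1]   # largest eigenvalue

print("Var(lambda_1) ≈", lam1.var(), "   (bound from the corollary: 2)")
```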

Corollary

Lemma ("Tensorization" of PI)

Suppose $\mu_{1}, \dots, \mu_{N}$ are probability measures over $\mathbb{R}$, each with PI($C$). Then the product measure $\mu = \mu_{1} \otimes \dots \otimes \mu_{N}$ also has PI($C$).

Proof

Using the Efron-Stein theorem, we see

$$\begin{align} \text{Var}_{X\sim \mu}(f(x)) &\leq \sum_{i=1}^N \mathbb{E}_{\text{all except }i}\left[ \overbrace{\text{Var}_{X_{i}\sim \mu_{i}}(f(x)) }^{PI \text{ of }\mu_{i}}\right] \\ &\leq \sum_{i=1}^N \mathbb{E}_{\text{all except }i}\left[C\,\mathbb{E}_{X_{i}\sim \mu_{i}}\left[(\partial_{i} f(x))^2\right] \right] \\ &\leq C\, \mathbb{E}\left[\lVert \nabla f(x) \rVert ^2\right] \end{align}$$

see tensorization of PI

Proof of GPI

Proof

By the tensorization lemma, it suffices to prove the case $N = 1$. We will use the central limit theorem and Efron-Stein.

Introduce i.i.d. Rademacher variables $z_{1}, \dots, z_{k} \sim \text{Unif}\{\pm 1\}$. Then we have

$$\begin{align} \text{Var}_{g\sim {\cal N}(0, 1)}(f(g)) &\overset{\text{(CLT)}}{\approx} \text{Var}\left( f\left( \frac{1}{\sqrt{ k }}\sum_{i=1}^k z_{i} \right) \right) \\ &\overset{\text{(ES)}}{\leq} \frac{1}{2} \sum_{i=1}^k \mathbb{E}\left[\left( f\left( \underbrace{ \frac{1}{\sqrt{ k }}\sum_{j\neq i}z_{j} }_{ S_{i} } + \frac{z_{i}}{\sqrt{ k }} \right) -f\left( S_{i} + \frac{z_{i}'}{\sqrt{ k }} \right) \right)^2\right] \\ &= \frac{1}{4} \sum_{i=1}^k \mathbb{E}\left[\left( f\left( S_{i}+\frac{1}{\sqrt{ k }} \right)-f\left( S_{i}-\frac{1}{\sqrt{ k }} \right) \right)^2\right] \\ &\approx \frac{1}{4} \sum_{i=1}^k \mathbb{E}\left[\left( f'(S_{i})\, \frac{2}{\sqrt{ k }} \right)^2\right] \\ &= \mathbb{E}\left[(f'(S_{1}))^2\right] \\ &\approx \mathbb{E}_{g \sim {\cal N}(0,1)}\left[(f'(g))^2\right] \end{align}$$

where $z_{i}'$ is an independent copy of $z_{i}$.
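An illustrative check of the CLT step (my own sketch; the smooth test function $f = \tanh$ and the value of $k$ are arbitrary): the variance of $f$ under a normalized Rademacher sum is close to its variance under ${\cal N}(0,1)$.

```python
# Compare Var(f(g)) for g ~ N(0,1) with Var(f(k^{-1/2} * sum z_i)) for
# i.i.d. Rademacher signs z_i, illustrating the CLT replacement in the proof.
import numpy as np

rng = np.random.default_rng(3)
f = np.tanh                        # a smooth 1-Lipschitz test function
k, n_samples = 200, 50_000

g = rng.standard_normal(n_samples)
z = rng.choice([-1.0, 1.0], size=(n_samples, k))
s = z.sum(axis=1) / np.sqrt(k)     # approximately N(0,1) by the CLT

print("Var(f(g)) ≈", f(g).var())
print("Var(f(normalized Rademacher sum)) ≈", f(s).var())
```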

Example

where . Then and is Lipschitz. Thus

Subgaussian Concentration

Our bounded difference inequality gives us

Theorem (McDiarmid / Azuma-Hoeffding) is -subgaussian. And from homework 1, we have

Proof Like Efron-Stein (to get something that looks like ), using a Doob martingale to get something from Hoeffding.
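A hedged illustration (my own sketch; the lecture's exact subgaussian constant is not reproduced here): for $f$ equal to the empirical mean of $n$ i.i.d. Uniform$[0,1]$ variables, the bounded differences are $c_i = 1/n$, and the McDiarmid tail $2\exp(-2nt^2)$ can be compared to simulation.

```python
# Empirical tail of |mean(x) - 1/2| versus the McDiarmid bound 2*exp(-2*n*t^2)
# for x_1, ..., x_n i.i.d. Uniform[0,1] (bounded differences c_i = 1/n).
import numpy as np

rng = np.random.default_rng(4)
n, n_trials, t = 100, 100_000, 0.1

x = rng.uniform(size=(n_trials, n))
dev = np.abs(x.mean(axis=1) - 0.5)

print("empirical P(|f - E f| >= t) ≈", (dev >= t).mean())
print("McDiarmid bound 2 exp(-2 n t^2) =", 2 * np.exp(-2 * n * t**2))
```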

Fact (important)

Theorem (important!)

(Generalization of GPI for Subgaussians)

If $x \sim {\cal N}(0, I_N)$ and $f$ is $L$-Lipschitz, then $f(x)$ is $L^2$-subgaussian, i.e.

$$\begin{align} \mathbb{P}[\,\lvert f(x)-\mathbb{E}\left[f(x)\right] \rvert \geq t] &\leq 2\exp\left( -\frac{t^2}{2L^2} \right) \\ \iff f(x)&= \mathbb{E}\left[f(x)\right] \pm {\cal O}(L)\quad \text{w.h.p.} \end{align}$$

See Gaussian Lipschitz concentration inequality HD Stat Theorem 2.26
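A quick numerical check of the tail bound (my own sketch; it takes $f(x) = \lVert x \rVert_2$, which is 1-Lipschitz, so the predicted tail is $2\exp(-t^2/2)$):

```python
# Compare the empirical tail of |f(x) - E f(x)| for f(x) = ||x||_2,
# x ~ N(0, I_N), with the Gaussian-Lipschitz bound 2*exp(-t^2 / (2 L^2)), L = 1.
import numpy as np

rng = np.random.default_rng(5)
N, n_samples = 50, 200_000

x = rng.standard_normal((n_samples, N))
f_vals = np.linalg.norm(x, axis=1)
dev = np.abs(f_vals - f_vals.mean())

for t in [1.0, 2.0, 3.0]:
    print(f"t = {t}: empirical {(dev >= t).mean():.2e}   bound {2 * np.exp(-t**2 / 2):.2e}")
```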

"Very Gaussian" proof

Exercise

(homework 4)

Corollaries If and , then

  • is 1-subgaussian
  • is 2-subgaussian
  • is subgaussian
  • is 2-subgaussian
  • If , then with high probability, is 1-subgaussian (compare to the proof for Johnson-Lindenstrauss)
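Since the exact statements above lost their formulas in these notes, here is only a hedged sketch related to the last corollary (the Johnson-Lindenstrauss ingredient): for a fixed unit vector $v$ and a matrix $A$ with i.i.d. ${\cal N}(0,1)$ entries, the map $A \mapsto \lVert Av \rVert / \sqrt{k}$ is $(1/\sqrt{k})$-Lipschitz, so it concentrates tightly around 1.

```python
# For a fixed unit vector v and A with i.i.d. N(0,1) entries (k x d),
# ||A v|| / sqrt(k) concentrates around 1 with fluctuations of order 1/sqrt(k).
import numpy as np

rng = np.random.default_rng(6)
d, k, n_trials = 200, 64, 5000

v = rng.standard_normal(d)
v /= np.linalg.norm(v)                        # fixed unit vector

norms = np.empty(n_trials)
for t in range(n_trials):
    A = rng.standard_normal((k, d))           # fresh k x d Gaussian matrix
    norms[t] = np.linalg.norm(A @ v) / np.sqrt(k)

print("mean ≈", norms.mean(), "   std ≈", norms.std(), "   (1/sqrt(k) ≈", 1 / np.sqrt(k), ")")
```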

Log-Sobolev Inequality

For $X \geq 0$,

$$\begin{align} \text{Ent}[X] &= \mathbb{E}\left[X\log X\right] -\mathbb{E}\left[X\right] \log(\mathbb{E}\left[X\right] ) \\ &= \mathbb{E}\left[\phi(X)\right] - \phi(\mathbb{E}\left[X\right] ), \qquad \phi(x)=x\log x \end{align}$$

We can think of this as a "generalized variance".

Lemma 1 ([[Efron-Stein theorem|Efron-Stein]] for Entropy)

$$\text{Ent}[f(x)] \leq \sum_{i=1}^N \mathbb{E}_{\text{all except }i}\left[\text{Ent}_{X_{i}}[f(x)]\right]$$

Lemma (Herbst argument) Suppose for all $\lambda$, $\frac{\text{Ent}\left[ {e^{\lambda X}} \right]}{\mathbb{E}\left[e^{\lambda X}\right]} \leq \frac{\sigma^2}{2}\lambda^2$. Then $X$ is $\sigma^2$ [[subgaussian]].

$\frac{\text{Ent}\left[ {e^{\lambda X}} \right]}{\mathbb{E}\left[e^{\lambda X}\right]}$ is like $\psi_{X}(\lambda)$ but with tensorization! Here $\psi_{X}(\lambda)=\log(\mathbb{E}\left[e^{\lambda X}\right])$ is the log moment generating function.

Def (Modified Log-Sobolev Inequality) We say $\mu$ satisfies **MLSI**($C$) if

$$\text{Ent}(e^{f(x)}) \leq C\,\mathbb{E}_{x \sim \mu}\left[\lvert \lvert \nabla f(x) \rvert \rvert^2 e^{f(x)}\right]$$

see [[MLSI]]

Lemma MLSI tensorizes (like [[pointcare inequality|PI]])

Theorem If $\mu$ has [[MLSI]]$(C)$ and $f$ is $L$ [[Lipschitz continuous|Lipschitz]], and $x \sim \mu$, then $f(x)$ is $2CL^2$ [[subgaussian]]

Theorem ${\cal N}(0, 1)$ has [[MLSI]]$\left( \frac{1}{2} \right)$

```datacorejsx
const { dateTime } = await cJS()
return function View() {
	const file = dc.useCurrentFile();
	return <p class="dv-modified">Created {dateTime.getCreated(file)} ֍ Last Modified {dateTime.getLastMod(file)}</p>
}
```