Class Notes
Summary
- Subgaussian concentration inequalities for random matrices
- McDiarmid Inequality
- log-Sobolev inequalities
- Gaussian-Lipschitz concentration
- Applications of Gaussian-Lipschitz
7. Non-Asymptotic RMT
Last time, we saw
Corollary: for $X$ symmetric with independent entries in $[-1, 1]$, $\lambda_{1}(X)$ concentrates around its mean.
What if we don't assume
- i.i.d. entries
- centered entries ($\mathbb{E}[X_{ij}] = 0$)?
Example
Any adjacency matrix of an (unweighted) random graph with independent edges (Erdős-Rényi, SBM)
Remark
- The bounded differences inequality still applies (it only needs independent, bounded entries)
- The Efron-Stein theorem still applies, giving a variance bound
Question: what about, for example, when the entries are unbounded, e.g. $X_{ij} \sim {\cal N}(0, 1)$?
Poincaré Inequality
Let $\mu$ be a probability measure on $\mathbb{R}^N$. We say that $\mu$ satisfies the Poincaré inequality with constant $C$, denoted $PI(C)$, if for all differentiable $f: \mathbb{R}^N \to \mathbb{R}$,
$$\text{Var}_{x \sim \mu}(f(x)) \leq C\, \mathbb{E}_{x \sim \mu}\left[ \lVert \nabla f(x) \rVert ^2 \right]$$
Notes
- $\mathbb{E}\left[ \lVert \nabla f(x) \rVert ^2 \right]$ is an expression of the function's sensitivity to each coordinate
- Sometimes, $C$ serves as the "other" notion of "$\sigma^2$"
Theorem
If $\mu$ satisfies $PI(C)$ and $f$ is $L$-Lipschitz, then
$$\text{Var}_{x \sim \mu}(f(x)) \leq C L^2$$
Proof
Exercise
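A one-line sketch of the exercise, in case it's useful: an $L$-Lipschitz $f$ is differentiable a.e. (Rademacher's theorem) with $\lVert \nabla f(x) \rVert \leq L$, so
$$\text{Var}_{x\sim\mu}(f(x)) \overset{PI(C)}{\leq} C\,\mathbb{E}\left[\lVert \nabla f(x) \rVert ^2\right] \leq CL^2$$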
Gaussian Poincaré Inequality Theorem (GPI)
${\cal N}(0, I_{N})$ satisfies the Poincaré inequality with $C = 1$ (independent of $N$!)
See Gaussian Poincaré Inequality Theorem
Corollary
For $X \sim \mathrm{GOE}(d)$, $\text{Var}(\lambda_{1}(X)) \leq 2$, independent of $d$.
Proof
We can view $X \sim \mathrm{GOE}(d)$ as a function of $N = \frac{d(d+1)}{2}$ i.i.d. standard Gaussians $g$ and construct $X(g)$, where
$$X(g) = \begin{bmatrix} \sqrt{ 2 }\,g_{1,1} & \dots & & \\ \vdots & \ddots & g_{i,j} & \\ & g_{i,j} & \ddots & \\ & & & \sqrt{ 2 }\,g_{\frac{d(d+1)}{2}} \end{bmatrix}$$
(one Gaussian per entry on or above the diagonal, with diagonal entries scaled by $\sqrt{ 2 }$). Then define $f(g)=\lambda_{1}(X(g))$. This is Lipschitz with $L=\sqrt{ 2 }$, since
$$\begin{align} \lvert f(g)-f(h) \rvert &= \lvert \lambda_{1}(X(g)) - \lambda_{1}(X(h)) \rvert \\ &\leq \lVert X(g) - X(h) \rVert _{\text{op}} \\ &\leq \lVert X(g) - X(h) \rVert _{F} \\ &\leq \sqrt{ 2 }\, \lVert g-h \rVert \end{align}$$
and therefore, by the GPI,
$$\text{Var}_{X \sim \mathrm{GOE}}(\lambda_{1}(X)) = \text{Var}_{g \sim {\cal N}(0, I_{N})}(f(g)) \leq 1 \cdot (\sqrt{ 2 })^2 = 2$$
$$\tag*{$\blacksquare$}$$
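Not from the lecture, but a quick Monte Carlo sanity check of this corollary (a sketch, assuming the GOE normalization above: ${\cal N}(0,1)$ off-diagonal entries, ${\cal N}(0,2)$ on the diagonal). The empirical variance of $\lambda_{1}$ should sit below the dimension-free bound of 2:

```python
# Monte Carlo check that Var(lambda_1(GOE(d))) <= 2, independent of d.
import numpy as np

rng = np.random.default_rng(0)

def sample_goe(d):
    # Symmetrize a Gaussian matrix: off-diagonal entries ~ N(0,1),
    # diagonal entries ~ N(0,2), matching the X(g) construction above.
    A = rng.normal(size=(d, d))
    return (A + A.T) / np.sqrt(2)

for d in [10, 50, 200]:
    top = [np.linalg.eigvalsh(sample_goe(d))[-1] for _ in range(500)]
    print(f"d={d:3d}: Var(lambda_1) ~ {np.var(top):.3f} (bound: 2)")
```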
Lemma ("Tensorization" of PI)
Suppose $\mu_{1}, \dots, \mu_{N}$ are probability measures over $\mathbb{R}$, each satisfying $PI(C)$. Then the product measure $\mu = \mu_{1} \otimes \dots \otimes \mu_{N}$ also satisfies $PI(C)$.
Proof
Using the Efron-Stein theorem, we see
$$\begin{align} \text{Var}_{x\sim \mu}(f(x)) &\leq \sum_{i=1}^N \mathbb{E}_{\text{all except }i}\left[ \overbrace{ \text{Var}_{x_{i}\sim \mu_{i}}(f(x)) }^{PI \text{ of }\mu_{i}}\right] \\ &\leq \sum_{i=1}^N \mathbb{E}_{\text{all except }i}\left[C\,\mathbb{E}_{x_{i}\sim \mu_{i}}\left[(\partial_{i} f(x))^2\right] \right] \\ &\leq C\, \mathbb{E}\left[\lVert \nabla f(x) \rVert ^2\right] \end{align}$$
$$\tag*{$\blacksquare$}$$
Proof of GPI
Proof
By the tensorization lemma, it suffices to prove the case $N = 1$. We will use the central limit theorem and Efron-Stein.
Introduce $z_{1}, \dots, z_{k}$ i.i.d. uniform on $\{ \pm 1 \}$ (Rademacher), so that $\frac{1}{\sqrt{ k }}\sum_{i=1}^{k} z_{i} \approx {\cal N}(0,1)$ for large $k$. Then we have
$$\begin{align} \text{Var}_{g\sim {\cal N}(0, 1)}(f(g)) &\overset{\text{(CLT)}}{\approx} \text{Var}\left( f\left( \frac{1}{\sqrt{ k }}\sum_{i=1}^k z_{i} \right) \right) \\ &\overset{\text{(ES)}}{\leq} \frac{1}{2} \sum_{i=1}^k \mathbb{E}\left[\left( f\left( \underbrace{ \frac{1}{\sqrt{ k }} \sum_{j\neq i}z_{j} }_{ S_{i} } + \frac{z_{i}}{\sqrt{ k }} \right) -f\left( S_{i} + \frac{z_{i}'}{\sqrt{ k }} \right) \right)^2\right] \\ &= \frac{1}{4} \sum_{i=1}^k \mathbb{E}\left[\left( f\left( S_{i}+\frac{1}{\sqrt{ k }} \right)-f\left( S_{i}-\frac{1}{\sqrt{ k }} \right) \right)^2\right] \\ &\approx \frac{1}{4} \sum_{i=1}^k \mathbb{E}\left[\left( f'(S_{i})\, \frac{2}{\sqrt{ k }} \right)^2\right] \\ &= \mathbb{E}\left[(f'(S_{1}))^2\right] \\ &\approx \mathbb{E}_{g \sim {\cal N}(0,1)}\left[(f'(g))^2\right] \end{align}$$
where $z_{i}'$ is an independent copy of $z_{i}$ (the factor $\frac{1}{4}$ comes from $z_{i} \neq z_{i}'$ with probability $\frac{1}{2}$), and the last line uses the CLT again for $S_{1}$.
$$\tag*{$\blacksquare$}$$
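A small numerical illustration of this argument (my sketch; $f=\tanh$ is just an arbitrary smooth test function, not from the lecture). The variance of $f$ applied to a normalized Rademacher sum is close to $\text{Var}_{g}(f(g))$, and both sit below the GPI bound $\mathbb{E}[f'(g)^2]$:

```python
# Illustrate the CLT + Efron-Stein route to the Gaussian Poincare inequality.
import numpy as np

rng = np.random.default_rng(1)
f = np.tanh
f_prime = lambda t: 1.0 / np.cosh(t) ** 2    # derivative of tanh

k, n = 200, 20_000
z = rng.choice([-1.0, 1.0], size=(n, k))     # i.i.d. Rademacher signs
s = z.sum(axis=1) / np.sqrt(k)               # approx N(0,1) by the CLT
g = rng.normal(size=n)                       # exact N(0,1) samples

print("Var(f(Rademacher sum)):", np.var(f(s)))
print("Var(f(g)), g ~ N(0,1) :", np.var(f(g)))
print("E[f'(g)^2] (GPI bound):", np.mean(f_prime(g) ** 2))
```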
Example
$f(x) = \lVert x \rVert$ where $x \sim {\cal N}(0, I_{d})$. Then $\nabla f(x) = \frac{x}{\lVert x \rVert}$ and $f$ is $1$-Lipschitz. Thus
$$\text{Var}(\lVert x \rVert) \leq 1, \quad \text{independent of } d \text{ (while } \mathbb{E}\lVert x \rVert \approx \sqrt{ d }\text{)}$$
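A three-line empirical check of this dimension-free bound (under my reconstruction of the example above as $f(x) = \lVert x \rVert$):

```python
# Var(||x||) stays <= 1 for x ~ N(0, I_d), even as E[||x||] ~ sqrt(d) grows.
import numpy as np

rng = np.random.default_rng(2)
for d in [2, 100, 10_000]:
    norms = np.linalg.norm(rng.normal(size=(2_000, d)), axis=1)
    print(f"d={d:5d}: E||x|| ~ {norms.mean():7.2f}, Var(||x||) ~ {norms.var():.3f}")
```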
Subgaussian Concentration
Our bounded difference inequality gives us
Theorem (McDiarmid / Azuma-Hoeffding)
If $f$ has the bounded differences property with constants $c_{1}, \dots, c_{N}$ (changing the $i$-th coordinate changes $f$ by at most $c_{i}$) and $x$ has independent coordinates, then $f(x)$ is $\frac{1}{4}\sum_{i=1}^N c_{i}^2$-subgaussian. And from homework 1, subgaussianity gives the tail bound $\mathbb{P}\left[\lvert f(x)-\mathbb{E}\left[f(x)\right] \rvert \geq t\right] \leq 2\exp\left( -\frac{2t^2}{\sum_{i}c_{i}^2} \right)$.
Proof: like Efron-Stein (to get a coordinate-by-coordinate decomposition), using a Doob martingale, then apply Hoeffding's lemma to each increment.
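The martingale mentioned here, spelled out (standard, filling in the sketch): decompose $f(x) - \mathbb{E}[f(x)]$ into Doob increments
$$D_{i} = \mathbb{E}\left[f(x) \mid x_{1},\dots,x_{i}\right] - \mathbb{E}\left[f(x) \mid x_{1},\dots,x_{i-1}\right], \qquad f(x)-\mathbb{E}\left[f(x)\right] = \sum_{i=1}^N D_{i}$$
Bounded differences force each $D_{i}$ to lie in a (conditional) interval of length $c_{i}$, so Hoeffding's lemma bounds each conditional MGF by $e^{\lambda^2 c_{i}^2/8}$, and multiplying the $N$ bounds gives the $\frac{1}{4}\sum_{i}c_{i}^2$ variance proxy.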
Theorem (important!)
(Gaussian Lipschitz concentration; a generalization of GPI for subgaussians)
If $x \sim {\cal N}(0, I_{N})$ and $f$ is $L$-Lipschitz, then $f(x)$ is $L^2$-subgaussian, i.e.
$$\begin{align} \mathbb{P}\left[\,\lvert f(x)-\mathbb{E}\left[f(x)\right] \rvert \geq t\,\right] &\leq 2\exp\left( -\frac{t^2}{2L^2} \right) \\ \iff f(x)&= \mathbb{E}\left[f(x)\right] \pm {\cal O}(L)\quad \text{w.h.p.} \end{align}$$
See Gaussian Lipschitz concentration inequality (HD Stat, Theorem 2.26)
"Very Gaussian" proof
Exercise
(homework 4)
Corollaries
If $x \sim {\cal N}(0, I_{d})$ and $X \sim \mathrm{GOE}(d)$, then
- $\lVert x \rVert$ is 1-subgaussian
- $\lambda_{1}(X)$ is 2-subgaussian
- is subgaussian
- is 2-subgaussian
- If , then with high probability, is 1-subgaussian (compare to the proof for Johnson-Lindenstrauss)
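Not from the notes: a quick empirical tail check of the theorem for one concrete 1-Lipschitz function, $f(x) = \max_{i} x_{i}$ (my choice of test function). The empirical tail should sit below $2\exp(-t^2/2)$:

```python
# Empirical tails of f(x) = max_i x_i (1-Lipschitz) vs the subgaussian bound.
import numpy as np

rng = np.random.default_rng(3)
m = rng.normal(size=(200_000, 50)).max(axis=1)   # f(x) for x ~ N(0, I_50)
dev = np.abs(m - m.mean())
for t in [0.5, 1.0, 2.0]:
    print(f"t={t}: P(|f - Ef| >= t) ~ {(dev >= t).mean():.4f}, "
          f"bound 2exp(-t^2/2) = {2 * np.exp(-t**2 / 2):.4f}")
```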
Log-Sobolev Inequality
For $X \geq 0$,
$$\begin{align} \text{Ent}[X] &= \mathbb{E}\left[X\log X\right] -\mathbb{E}\left[X\right] \log(\mathbb{E}\left[X\right] ) \\ &= \mathbb{E}\left[\phi(X)\right] - \phi(\mathbb{E}\left[X\right] ) \end{align}$$
where $\phi(t) = t\log t$. We can think of this as a "generalized variance".
Lemma 1 ([[Efron-Stein theorem|Efron-Stein]] for Entropy)
$$\text{Ent}[f(x)] \leq \sum_{i=1}^N \mathbb{E}_{\text{all except }i}\left[\text{Ent}_{x_{i}}[f(x)]\right]$$
where $\text{Ent}_{x_{i}}$ denotes entropy over the $i$-th coordinate alone.
Lemma (Herbst argument)
Suppose for all $\lambda$, $\frac{\text{Ent}\left[ {e^{\lambda X}} \right]}{\mathbb{E}_{}\left[e^{\lambda X}\right]} \leq \frac{\sigma^2}{2}\lambda^2$. Then $X$ is $\sigma^2$-[[subgaussian]].
$\frac{\text{Ent}\left[ {e^{\lambda X}} \right]}{\mathbb{E}_{}\left[e^{\lambda X}\right]}$ is like $\psi_{X}(\lambda)$ but with tensorization! Here $\psi_{X}(\lambda)=\log(\mathbb{E}_{}\left[e^{\lambda X}\right])$ is the cumulant generating function (the log of the moment generating function).
Def (Modified Log-Sobolev Inequality)
We say $\mu$ satisfies **MLSI**($C$) if
$$\text{Ent}(e^{f(x)}) \leq C\,\mathbb{E}_{x \sim \mu}\left[\lVert \nabla f(x) \rVert^2 e^{f(x)}\right]$$
see [[MLSI]]
Lemma
MLSI tensorizes (like [[pointcare inequality|PI]])
Theorem
If $\mu$ has [[MLSI]]$(C)$ and $f$ is $L$-[[Lipschitz continuous|Lipschitz]], and $x \sim \mu$, then $f(x)$ is $2CL^2$-[[subgaussian]]
Theorem
${\cal N}(0, 1)$ has [[MLSI]]$\left( \frac{1}{2} \right)$
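A sketch of the calculus behind the Herbst argument above (standard, not spelled out in the notes): writing $\psi_{X}(\lambda) = \log \mathbb{E}\left[e^{\lambda X}\right]$, a direct computation gives $\frac{\text{Ent}\left[ e^{\lambda X} \right]}{\mathbb{E}\left[e^{\lambda X}\right]} = \lambda \psi_{X}'(\lambda) - \psi_{X}(\lambda)$, so the hypothesis says
$$\frac{d}{d\lambda}\left( \frac{\psi_{X}(\lambda)}{\lambda} \right) = \frac{\lambda \psi_{X}'(\lambda) - \psi_{X}(\lambda)}{\lambda^2} \leq \frac{\sigma^2}{2}$$
Integrating from $0$ (where $\frac{\psi_{X}(\lambda)}{\lambda} \to \mathbb{E}[X]$) gives $\psi_{X}(\lambda) \leq \lambda\,\mathbb{E}[X] + \frac{\sigma^2}{2}\lambda^2$ for all $\lambda$, which is exactly $\sigma^2$-subgaussianity.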