concentration inequality for magnitude of standard gaussian random vector

[[concept]]

# Topics

```dataviewjs
const fieldName = "theme"; // Frontmatter field containing the links
const oldPrefix = "Thoughts/01 Themes/";
const newPrefix = "Digital Garden/Topics/";

const relatedLinks = dv.current()[fieldName];

if (Array.isArray(relatedLinks)) {
    // Map over the links, rewrite the path prefix, and output only clickable links
    dv.el("span",
        relatedLinks
            .map(link => {
                if (link && link.path) {
                    let newPath = link.path.startsWith(oldPrefix)
                        ? link.path.replace(oldPrefix, newPrefix)
                        : link.path;
                    return dv.fileLink(newPath);
                }
            })
            .filter(Boolean) // Remove any undefined/null items
            .join(", ")
    );
} else {
    dv.el("span", dv.current()[fieldName]);
}
```

# Statement

>[!theorem]
>Let $x \in \mathbb{R}^d$ be a standard Gaussian random vector and let $t \geq 0$. Then
>$$\begin{align} \mathbb{P}\left[\left\lvert \, \lVert x \rVert^2 - d \, \right\rvert \geq t\right] &= \mathbb{P}\left[ \left\lvert \sum_{i=1}^d x_{i}^2 - d \right\rvert \geq t \right] \\ &\leq 2 \begin{cases} \exp\left( -\frac{t^2}{8d} \right)&\text{if } t \leq d \\ \exp\left( -\frac{t}{8} \right) & \text{if }t \geq d \end{cases} \\ &= 2\exp\left( -\frac{1}{8} \min \left\{ \frac{t^2}{d}, t \right\}\right) \end{align}$$
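
For a sense of scale, the bound is easy to evaluate numerically; a minimal Python sketch (the function name is my own, not from any library):

```python
import math

def gaussian_norm_tail_bound(d: int, t: float) -> float:
    """The theorem's upper bound on P[ | ||x||^2 - d | >= t ]."""
    return 2 * math.exp(-min(t**2 / d, t) / 8)

# In d = 100 dimensions, a deviation of t = 60 is already unlikely:
print(gaussian_norm_tail_bound(100, 60))  # 2 * exp(-4.5) ≈ 0.022
```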

>[!important]
>This tells us that, with high probability (taking $t$ to be a suitably large multiple of $\sqrt{ d }$), we have
>$$\begin{align} \lVert x \rVert ^2 &= d + {\cal O}(\sqrt{ d }) \\ \implies \lVert x \rVert &= \sqrt{ d + {\cal O}(\sqrt{ d }) } \\ &= \sqrt{ d } \cdot \sqrt{ 1+ {\cal O}\left( \frac{1}{\sqrt{ d }} \right)} \\ &= \sqrt{ d }\left( 1+{\cal O}\left( \frac{1}{\sqrt{ d }} \right) \right) \\ &= \sqrt{ d } + {\cal O}(1) \end{align}$$
>i.e., the standard [[Concept Wiki/gaussian random vector]] falls, with high probability, within a spherical shell of width ${\cal O}(1)$ around the sphere of radius $\sqrt{ d }$.
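
This shell picture is easy to check empirically; a minimal numpy sketch (the dimension, sample count, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample n standard Gaussian vectors in dimension d and look at their norms.
d, n = 10_000, 1_000
x = rng.standard_normal((n, d))
norms = np.linalg.norm(x, axis=1)

print(np.sqrt(d))    # 100.0 -- radius of the sphere
print(norms.mean())  # ≈ 100 -- norms concentrate at sqrt(d)
print(norms.std())   # O(1) spread, no matter how large d is
```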

# Proof

>[!note]
>This proof uses the Chernoff Method, a "nonlinear Markov" inequality (apply Markov's Inequality to an increasing function of the random variable) that underlies concentration inequalities like Chebyshev's, Hoeffding's, Bernstein's, Azuma-Hoeffding, and McDiarmid's.
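
Schematically: for any random variable $X$, threshold $t$, and any $\lambda > 0$, applying Markov's Inequality to the nonnegative, increasing function $e^{\lambda X}$ gives
$$\mathbb{P}[X \geq t] = \mathbb{P}\left[ e^{\lambda X} \geq e^{\lambda t} \right] \leq e^{-\lambda t}\, \mathbb{E}\left[ e^{\lambda X} \right]$$
and the Chernoff bound is obtained by optimizing the right-hand side over $\lambda > 0$. The proof below is exactly this recipe.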

**Proof.** We deal with only one side of the distribution; the desired two-sided inequality follows from a union bound over the two tails, which contributes the factor of 2.

Note that $\mathbb{E}\left[ \lVert x \rVert^2 \right] = \sum_{i=1}^d \mathbb{E}[x_{i}^2] = d$, so define $s_{i} := x_{i}^2 - 1$, which has $\mathbb{E}[s_{i}] = 0$. Then

$$\begin{align} \mathbb{P}\left[ \sum_{i=1}^d x_{i}^2 - d \geq t\right] &= \mathbb{P}\left[ \sum_{i=1}^d s_{i} \geq t \right] \\ &= \mathbb{P}\left[ \exp\left( \lambda \sum_{i=1}^d s_{i} \right) \geq \exp(\lambda t) \right] \\ (*)& \leq \frac{\mathbb{E}\left[ \exp\left( \lambda \sum_{i=1}^d s_{i} \right) \right]}{\exp(\lambda t)} \\ (**)& = \frac{\mathbb{E}[\exp(\lambda s_{1})]^d}{\exp(\lambda t)} \\ &= \exp(-\lambda t + d \log(\mathbb{E}[\exp(\lambda s_{1})])) \end{align}$$
for any $\lambda > 0$, where
- $(*)$ is from [[Concept Wiki/Markov's Inequality]]
- $(* *)$ is because the $x_{i}$ (and hence the $s_{i}$) are iid

Now, define the *cumulant generating function* ([wikipedia](https://en.wikipedia.org/wiki/Cumulant)) of $s_{1} = x_{1}^2 - 1$ (which has expectation 0):
$$\psi(\lambda) := \log(\mathbb{E}[\exp(\lambda s_{1})]) = \log(\mathbb{E}[\exp(\lambda(x_{1}^2 - 1))])$$
Note that $\mathbb{E}[\exp(\lambda x_{1}^2)]$ is finite if and only if $\lambda < \frac{1}{2}$, and in this case
$$\begin{align} \mathbb{E}[\exp(\lambda x_{1}^2)] &= \frac{1}{\sqrt{ 1-2\lambda }} \\ \implies \psi(\lambda) &= -\lambda + \frac{1}{2}\log\left( \frac{1}{1-2\lambda} \right) \\ ({\dagger})&= \lambda^2+{\cal O}(\lambda^3) \end{align}$$
where $({\dagger})$ is a Taylor expansion around $\lambda = 0$. This yields the bound $\psi(\lambda) \leq 2\lambda^2$ for $\lambda \leq \frac{1}{4}$.

>[!exercise]- Exercise (tedious) - Verify the bound
>- Plot $\psi(\lambda)$ and the bound $2\lambda^2$ (see notes page 6; figure 1.1)
>- Verify via algebra that $\psi(\lambda) \leq 2\lambda^2$ for all $\lambda \leq \frac{1}{4}$

Assuming $\lambda \leq \frac{1}{4}$, we have
$$\begin{align} \mathbb{P}\left[ \sum_{i=1}^d x_{i}^2 - d \geq t \right] &\leq \exp(-\lambda t + d \log(\mathbb{E}[\exp(\lambda s_{1})])) \\ &=\exp(-\lambda t + d\psi(\lambda)) \\ &\leq \exp(-\lambda t + 2d\lambda^2) \end{align}$$
Now, $2d\lambda^2 -\lambda t$ is a convex quadratic function of $\lambda$, and is thus minimized at $\lambda^* = \frac{t}{4d}$. We want to set $\lambda=\lambda^*$, but we must ensure that $\lambda \leq \frac{1}{4}$ for the bound on $\psi$ to apply. Thus, we have two cases:
- If $t \leq d$, then $\lambda^* \leq \frac{1}{4}$ and we can safely set $\lambda = \lambda^*$:
$$\mathbb{P}\left[ \sum_{i=1}^d x_{i}^2 - d \geq t \right] \leq \exp(-\lambda^* t + 2d(\lambda^*)^2)=\exp\left( -\frac{t^2}{4d}+\frac{t^2}{8d} \right)=\exp\left( -\frac{t^2}{8d} \right)$$
- If $t \geq d$, then $\lambda^* = \frac{t}{4d} \geq \frac{1}{4}$, so we instead set $\lambda=\frac{1}{4}$:
$$\mathbb{P}\left[ \sum_{i=1}^d x_{i}^2 - d \geq t \right] \leq \exp(-\lambda t + 2d\lambda^2)=\exp\left( -\frac{t}{4}+\frac{d}{8} \right) \leq \exp\left( -\frac{t}{8} \right)$$
  using $d \leq t$ in the last step.

Together with the same bound for the lower tail, this gives the desired result.
$$\tag*{$\blacksquare$}$$
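
The "tedious" exercise's bound $\psi(\lambda) \leq 2\lambda^2$ can at least be sanity-checked numerically; a minimal numpy sketch over a grid of $\lambda \in (0, \frac{1}{4}]$ (not a substitute for the algebra):

```python
import numpy as np

# psi(lam) = -lam + (1/2) * log(1 / (1 - 2*lam)), the CGF of x_1^2 - 1
lam = np.linspace(1e-6, 0.25, 10_000)
psi = -lam + 0.5 * np.log(1.0 / (1.0 - 2.0 * lam))

assert np.all(psi <= 2 * lam**2)        # psi(lam) <= 2*lam^2 on the whole grid
print(float(np.max(psi - 2 * lam**2)))  # negative: the bound is never violated
```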

# Remarks

>[!note]
>The important part of the proof is finding a quadratic bound $\psi(\lambda) \leq c\lambda^2$ valid for a bounded range of $\lambda$ (here $c = 2$ for $\lambda \leq \frac{1}{4}$). This means that the random variable $x_{1}^2 - 1$ is subexponential (a weaker version of being subgaussian).

This result is characteristic of sums of $d$ iid subexponential random variables. Sums of this type have "Gaussian tails" $\exp(-\Theta(t^2/d))$ up to a cutoff of $t \approx d$, after which they have "exponential tails" $\exp(-\Theta(t))$.
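
For context (this general statement is not proved in these notes): Bernstein's inequality for subexponential random variables (see e.g. Vershynin, *High-Dimensional Probability*) makes this precise. For iid mean-zero $s_{i}$ with subexponential norm $K$, there is an absolute constant $c > 0$ such that
$$\mathbb{P}\left[ \left\lvert \sum_{i=1}^d s_{i} \right\rvert \geq t \right] \leq 2\exp\left( -c \min\left\{ \frac{t^2}{K^2 d}, \frac{t}{K} \right\} \right)$$
The theorem above is this statement with explicit constants for the special case $s_{i} = x_{i}^2 - 1$.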

# Similar Results on the Way

By using Markov's Inequality and Chebyshev's Inequality, we can get something close to this, though with far weaker tail decay.

By Markov, we have
$$\mathbb{P}[\lVert x \rVert ^2 \geq d + t] \leq \frac{\mathbb{E}[\lVert x \rVert ^2]}{d+t} = \frac{d}{d+t}$$

And by Chebyshev we get

$$\begin{align} \mathbb{P}[\lVert x \rVert ^2 \geq d + t] &= \mathbb{P}[\lVert x \rVert ^2 \geq \mathbb{E}[\,\lVert x \rVert ^2] + t] \\ & \leq \frac{\text{Var}(\lVert x \rVert ^2)}{t^2} \\ &= \frac{2d}{t^2} \end{align}$$
where $\text{Var}[\lVert x \rVert^2] = d \,\text{Var}[x_{1}^2]=2d$. So Chebyshev also gives $\lVert x \rVert^2 = d + {\cal O}(\sqrt{ d })$ with high probability, by taking $t$ to be a large multiple of $\sqrt{ d }$.

Both of these results are a good start, but they decay only polynomially in $t$: at $t = d$, for example, Chebyshev gives $\frac{2}{d}$, while the Chernoff bound above gives $2\exp\left( -\frac{d}{8} \right)$.

# References

>[!references]
>## See Also

### Mentions

>[!mentions]+
>
>```datacorejsx
>const modules = await cJS()
>
>const COLUMNS = [
>    { id: "Name", value: page => page.$link },
>    { id: "Last Modified", value: page => modules.dateTime.getLastMod(page) },
>];
>
>return function View() {
>    const current = dc.useCurrentFile();
>    // Select pages that link to the current page.
>    let queryString = `@page and linksto(${current.$link})`;
>    let pages = dc.useQuery(queryString);
>
>    // Drop pages that fail the type check against the current page, then sort.
>    pages = pages.filter((p) => !modules.typeCheck.checkAll(p, current)).sort()
>
>    return <dc.Table columns={COLUMNS} rows={pages} paging={20}/>;
>}
>```

```datacorejsx
const { dateTime } = await cJS()

return function View() {
    const file = dc.useCurrentFile();
    return <p class="dv-modified">Created {dateTime.getCreated(file)} ֍ Last Modified {dateTime.getLastMod(file)}</p>
}
```