Random Matrix Lecture 04

[[lecture-data]]

[!info]

Evaluation Error: SyntaxError: Unexpected token '>'

at DataviewInlineApi.eval (plugin:dataview:19027:21)
at evalInContext (plugin:dataview:19028:7)
at asyncEvalInContext (plugin:dataview:19038:32)
at DataviewJSRenderer.render (plugin:dataview:19064:19)
at DataviewJSRenderer.onload (plugin:dataview:18606:14)
at DataviewJSRenderer.load (app://obsidian.md/app.js:1:1182416)
at DataviewApi.executeJs (plugin:dataview:19607:18)
at DataviewCompiler.eval (plugin:digitalgarden:10763:23)
at Generator.next (<anonymous>)
at eval (plugin:digitalgarden:90:61)

Class Notes

Notes pages 15-20 under Singular Values: Geometric Method

Summary

2. Rectangular Matrices

2.3 Spectral Analysis of Rectangular Gaussian Matrices

2.3.1 Singular Values: Geometric Method

Last time: Spectrum of random matrix

GN(0,1)d×m,dm

and especially dm

Δ:=GGTmId=GGTEGGT

But recall the [[operator norm]]:

||Δ||=supxSd1|xTΔx|=supxBd|xTΔx|Bd(x,r):={y:||xy||r}RdBd:=Bd(0,1)Sd1
Restricted Inner Product

Let X=[x1,,xN]Rd and let ΔMd be symmetric. Then we define

||Δ||X:=supxX|xTΔx|=Δ,xxT

see [[restricted inner product]]

This is a [[semi-norm]] - whether or not it is a [[norm]] depends on the set X.

+++ WHEN IS IT A NORM?
To be a norm, outer products need to span the space of symmetric matrices

Proposition

Let MRd×d be a symmetric [[random matrix]] and let XRd be finite. Then

P[||M||X>t]|X|maxxXP[|xTMx|>t]
There is a tradeoff between the factor |X| and the error in approximating the ||Δ||X.

2.3.2 Nets, Coverings, and Packings

Discretizing the operator norm

We want to know if the [[restricted inner product]] approximates the [[operator norm]]

ϵ-net

Let XYRd. X is an ϵ-covering or an ϵ-net of Y if

xXBd(x,ϵ)YyYxX s.t. ||yx||ϵ

see [[epsilon net]]

Lemma

If X is an [[epsilon net]] of Sd1 or Bd, then

||Δ||X||Δ||112ϵ||Δ||X

(if ϵ(0,12) for the second inequality)

Proof

Let xSd1 be such that |xTΔx|=||Δ|| and let xiX such that ||xxi||ϵ. Then

||Δ||X|xiTΔxi|=|(x+(xix))TΔ(x+(xix))|=|xTΔx|||Δ|||xTΔ(xix)|||x||||Δ||||xix||ϵ||Δ|||(xix)TΔxi|ϵ||Δ||(12ϵ)||Δ||
Note

++++ interpretation of special x

see [[epsilon net restricted inner product bounds the operator norm]]

We want to construct our epsilon nets so we can control their size. It is hard to manually describe nets in terms of their vectors, so we need some framework to define them otherwise.


Non-constructive existence of nets

ϵ-packing

Let XY. X is an ϵ-packing of Y if

Bd(x,ϵ)Bd(x,ϵ)=xxX

ie if any two of these balls are sufficiently far apart. ie,

||xx||>2ϵxxX

see [[epsilon packing]]

Lemma

If X is an net.

Maximal

We say that X is maximal if X{x} is not a packing for all xX

Proof

Let yYX. Maximality means Bd(y,ϵ)Bd(x,ϵ) for some xX. Let z be in this intersection

Then by the [[triangle inequality]] and by construction of our balls above, we get

||yx||||yz||+||zx||2ϵ

see [[maximal epsilon packing is also a net]]

We want to show that small nets exist. We have a volumetric bound on the size of packings (including maximal ones), and as we saw above, this means there is a corresponding net. So we have a plan:

  1. Show all epsilon packings are not "too large"
  2. Apply to a maximal [[epsilon packing]]
  3. This means that there exists a 2ϵ net that is not too large
Proposition

Suppose X is an [[epsilon packing]] of Bd. Then |X|(1+1ϵ)d

Note

The true value is is exp(O(dϵ))

Proof

Since X is a packing of Bd, we know

xXBd(x,ε)Bd(0,1+ε)|X|vol(Bd(0,ε))vol(Bd(0,1+ε))|X|vol(Bd(0,1+ε))vol(Bd(0,ε))=(1+εε)d

Note in the first line the LHS union is disjoint, which is how we get the second line.

see [[bound for the size of an epsilon packing]]

Corollary

For all ε>0, there exists an [[epsilon net]] X of Bd with |X|(1+2ϵ)d

Proof

Define X:= any maximal ε2 packing. The result follows from [[maximal epsilon packing is also a net]] and the size bound follows immediately from the [[bound for the size of an epsilon packing]]

see [[we can find an epsilon net with the bound for an epsilon packing]]

2.3.3 Coarse Non-Asymptotic Bound on Singular Values

Putting everything together, we can get a bound on ||Δ||.

Theorem

Ford<m, let GN(0,1)d×m. Define Δ:=GGTmId. Then

P[||Δ||64dm]2exp(d)

Note: \lvert \lvert G \rvert \rvert { #2} =\lvert \lvert GG^T \rvert \rvert \leq m+{\cal O}(\sqrt{ dm })

Proof

Let ε:=14. Then we have:

  1. There exists an [[epsilon net]] X of Bd where we our [[bound for the size of an epsilon packing]] yields
|X|(1+2ε)d=(1+214)d=9d
  1. We also have the bound (from [[epsilon net restricted inner product bounds the operator norm]])
||Δ||112ε||Δ||X=112(14)||Δ||X=2||Δ||X

So

P[||Δ||t]P[||Δ||Xt2]union boundxXP[|xTΔx|GGTmIdt2]

Since xXBd, we know ||x||1. So let x^:=x||x||Sd1. Then

xXP[|x^TΔx^|t2]=xXPG[|||GTx^Rm,Law=N(0,Im)||2m|t2]

Now, let g=GTx^

\begin{align} \implies &= \lvert {\cal X} \rvert \cdot \mathbb{P}_{g \sim N(0, I_{m})}\left[ \lvert\,\lvert \lvert g \rvert \rvert { #2} -m \,\rvert \geq \frac{t}{2} \right] \\ &\leq 9^d\cdot 2\exp\left( -\frac{1}{8}\min\left\{ \frac{t^2}{4m}, \frac{t}{2} \right\} \right) \\ &\leq 2\exp\left( 3d - \frac{1}{8} \min\left\{ \frac{C^2}{4}d, \underbrace{\frac{C}{2}\sqrt{ dm }}_{d\leq m \implies \sqrt{ dm } \geq d} \right\} \right) \end{align}

And setting t:=Cdm and noting that 9exp(3), we get

2exp(d[3min{C232,C16}C=64min(>4,4)])()C:=642exp(d)

For (), we want to pick C big enough that everything is negative.

Union Bound

the union bound is countable subadditivity or the fact that

P(iAi)iP(Ai)

see [[high probability bound for operator norm of difference for Gaussian covariance matrix]]

From [[Weyl's Theorem]], we can get the following result:

Corollary (special case)

Let A,BMd be symmetric. Then

λd(A)||B||λd(A+B)λ1(A+B)λ1(A)+||B||

Where |||| is the [[operator norm]].

Proof

The proof is similar to the one proving Weyl, except we can use the variational description of the eigenvalues instead of Courant-Fisher. For the first bound, we have

λ1(A+B)=supxSd1xT(A+B)xsupxSd1(xTAx+||B||)=supxSd1xTAx+||B||=λ1(A)+||B||

(by just distributing the sup and by definition of the [[operator norm]]). And, similarly, we have

λd(A+B)=infxSd1xT(A+B)xinfxSd1(xTAx||B||)=(infxSd1xTAx)||B||=λd(A)||B||
Corollary

Let GRd×m be a Gaussian [[random matrix]] with dm. Then

P[m64dσd(G)σ1(G)m+32d]>12exp(d)
Important

ie all singular values are {m±O(d)}

Note

This is a good event; we want this to happen. And luckily, we have high probability for this event.

Proof

Recall

First-order Taylor approximation (and concavity of ) yields

1+x1+x2for x01x1xfor 0x1

If the event for our [[high probability bound for operator norm of difference for Gaussian covariance matrix]] happens, then we have

||Δ||=||GGTmId||64dm

ie if the event happens, we have
(1)

σ1(G)=||G||=||GGT||()||mId||+||Δ||m+64dm=m1+64dm()m(1+642dm)=m+32d

Where

Then, for σd(G), we have a similar bound.

Note that if m64dm<0, then m64d<0 and the result follows. So suppose m64dm0.

(2)

σd(G)=λd(GGT)()λd(mId)||Δ||m64dm=m164dm()m(164dm)=m64d

Where

see [[high probability bound on singular values of gaussian random matrix]]

GTGRm×m

eigenvalues of 1mGTG for dm
we know
λ1λdλd+1λm0

λ1λd1
λd+1λm0

approximately a random projection matrix 1mGTG

Next time:

Review

#flashcards/math/rmt

TODO

const { dateTime } = await cJS()

return function View() {
	const file = dc.useCurrentFile();
	return <p class="dv-modified">Created {dateTime.getCreated(file)}     ֍     Last Modified {dateTime.getLastMod(file)}</p>
}