Note that this holds for any $d$, including $d \ll m$ and $d = 1$. This is because we are taking expectations.
If $y$ is selected before we realize $\hat{G}$, then we hope that $\|\hat{G}y\|$ will concentrate around $\|y\|$. Our calculation of the joint distribution shows that
$$\mathrm{Law}(\hat{G}y) = N\left(0, \frac{1}{d}\|y\|^2 I_d\right)$$
So (after rescaling) our concentration inequality for the magnitude of a standard Gaussian random vector applies, and we should see the desired behavior.
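This concentration is easy to check numerically. A minimal sketch with numpy (the dimensions here are illustrative, not from the notes): fix $y$ first, then draw $\hat{G}$ with i.i.d. $N(0, 1/d)$ entries and compare $\|\hat{G}y\|$ to $\|y\|$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 256, 10_000

# Fix y BEFORE realizing G_hat -- independence is what the argument needs.
y = rng.normal(size=m)

# Entries i.i.d. N(0, 1/d), so E||G_hat y||^2 = ||y||^2.
G_hat = rng.normal(scale=1 / np.sqrt(d), size=(d, m))

# Concentrates near 1, with deviations on the order of 1/sqrt(d).
ratio = np.linalg.norm(G_hat @ y) / np.linalg.norm(y)
```

Rerunning with smaller `d` widens the spread of `ratio` around 1, matching the claim that the *probability* of preserving geometry depends on $d$ even though the expectation does not.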
If $y$ does not depend on $\hat{G}$
Assuming we ensure $\hat{G} \perp y$, the probability of actually preserving geometry depends on $d$.
If $d = 1$, then $\hat{G} = g^\top$ for some $g \sim N(0, I_m)$. We still have
$$E[\|\hat{G}y_i\|^2] = E[\langle g, y_i \rangle^2] = \|y_i\|^2$$
But clearly $\langle g, y_i \rangle^2 \approx \|y_i\|^2$ cannot hold simultaneously for many $y_i$, regardless of whether we choose them before realizing $\hat{G}$.
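One way to see this concretely (a hedged sketch, with dimensions chosen only for illustration): for unit-norm $y_i$, each $\langle g, y_i \rangle^2$ is $\chi^2_1$-distributed, which is very spread out around its mean of 1, so most of the $y_i$ end up badly distorted even though the expectation is exact.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n_vecs = 100, 1_000

g = rng.normal(size=m)            # d = 1: G_hat = g^T with N(0, 1) entries
Y = rng.normal(size=(n_vecs, m))  # many target vectors, normalized to unit norm
Y /= np.linalg.norm(Y, axis=1, keepdims=True)

# Each <g, y_i>^2 has expectation ||y_i||^2 = 1, but is chi^2_1-distributed,
# so a large fraction of the y_i see their squared norm off by more than 50%.
sq_norms = (Y @ g) ** 2
frac_badly_distorted = np.mean(np.abs(sq_norms - 1) > 0.5)
```

For a $\chi^2_1$ variable, the fraction falling outside $[0.5, 1.5]$ is roughly 0.74, so `frac_badly_distorted` should come out well above one half.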
Exercise
Construct some examples of this case
If $y$ depends on $\hat{G}$...
If $d \ll m$, then once $\hat{G}$ is realized, we can pick $y \in \mathrm{Null}(\hat{G})$ so that $0 = \|\hat{G}y\| \ll \|y\|$.
In this case, $\hat{G}$ will not preserve the geometry of arbitrary vectors $y$.
In particular, we can find vectors $y$ that depend on $\hat{G}$ such that the geometry is badly distorted.
Normalized Gaussian random matrices preserve geometry (in expectation), BUT
Assuming we ensure $\hat{G} \perp y$, the probability of {1||preserving geometry} depends on {==2||the "output" dimension $d$==}.
For $\hat{G} \sim N(0, \frac{1}{d})^{\otimes d \times m}$ and $y \in \mathbb{R}^m$, what is the (co)variance matrix of $\hat{G}y$?
-?-
$\frac{1}{d}\|y\|^2 I_d$
If $\hat{G} \sim N(0, \frac{1}{d})^{\otimes d \times m}$, why do we need $y \in \mathbb{R}^m$ to be independent of $\hat{G}$ to preserve geometry if $d < m$?
-?-
Otherwise, we can choose $y \in \mathrm{Null}(\hat{G})$ and then the norm is not preserved.