Johnson-Lindenstrauss lemma

[[concept]]

Topics

const fieldName = "theme"; // Your field with links
const oldPrefix = "Thoughts/01 Themes/";
const newPrefix = "Digital Garden/Topics/";
 
const relatedLinks = dv.current()[fieldName];
 
if (Array.isArray(relatedLinks)) {
    // Map over the links, replace the path, and output only clickable links
    dv.el("span",
        relatedLinks
            .map(link => {
                if (link && link.path) {
                    let newPath = link.path.startsWith(oldPrefix)
                        ? link.path.replace(oldPrefix, newPrefix)
                        : link.path;
                    return dv.fileLink(newPath);
                }
            })
            .filter(Boolean).join(", ") 
	// Remove any undefined/null items
    );
} else {
    dv.el(dv.current().theme);
}
 
 

Theorem

Theorem (Johnson-Lindenstrauss)

Let and fix . Let . Suppose that And denote as the event that multiplication by is pairwise - faithful for the . Then we have

Proof

Note that being -faithful pairwise to the requires the preservation of the vector norms within a factor of .

note we can do this for any vectors and in particular the specified in the statement!

Let . Recall that the distribution of is . Thus we have

\mathbb{P}_{\hat{G} \sim {\cal N\left( 0, \frac{1}{d} \right)}^{\otimes d\times m}} \left[\,\left\lvert \,\lvert \lvert \hat{G}x \rvert \rvert^2 - \lvert \lvert x \rvert \rvert ^2 \, \right\rvert >\varepsilon \lvert \lvert x \rvert \rvert ^2\,\right] &= \mathbb{P}_{g \sim {\cal N}\left( 0, \frac{1}{d} \lvert \lvert x \rvert \rvert ^2 \right)} \left[ \,\left\lvert \,\lvert \lvert g \rvert \rvert ^2 - \lvert \lvert x \rvert \rvert ^2\, \right\rvert > \varepsilon \lvert \lvert x \rvert \rvert ^2 \, \right] \\ &= \mathbb{P}_{g \sim {\cal N}(0, I_{d})} \left[\, \left\lvert \frac{\,1}{d}\lvert \lvert x \rvert \rvert ^2 \cdot \lvert \lvert g \rvert \rvert ^2 - \lvert \lvert x \rvert \rvert ^2 \, \right\rvert >\varepsilon \lvert \lvert x \rvert \rvert ^2\, \right] \\ &= \mathbb{P}_{g \sim {\cal N}(0, I_d)} \left[ \,\left\lvert \,\lvert \lvert g \rvert \rvert ^2 - d\, \right\rvert > \varepsilon d\, \right] \end{align}$$ So we can apply our [[Concept Wiki/concentration inequality for magnitude of standard gaussian random vector]]! Since $\varepsilon<1$, we have $\varepsilon d<d$ and thus $\min \left\{ \frac{(\varepsilon d)^2}{d}, \varepsilon d \right\}=\varepsilon^2d$. Thus we have $$\begin{align} \mathbb{P}_{\hat{G} \sim {\cal N}\left( 0, \frac{1}{d} \right)^{\otimes d\times m}} \left[\,\left\lvert \,\lvert \lvert \hat{G}x \rvert \rvert^2 - \lvert \lvert x \rvert \rvert ^2 \, \right\rvert >\varepsilon \lvert \lvert x \rvert \rvert ^2\,\right] &\leq 2 \exp\left( -\frac{1}{8} \varepsilon^2d \right) \\ \implies \mathbb{P}_{\hat{G} \sim {\cal N}\left( 0, \frac{1}{d} \right)^{\otimes d\times m}} \left[\,(1-\varepsilon) \lvert \lvert x \rvert \rvert ^2 \leq\,\lvert \lvert \hat{G}x \rvert \rvert^2 \leq(1+\varepsilon)\lvert \lvert x \rvert \rvert ^2\,\right] &\geq 1 - 2\exp\left( -\frac{1}{8}\varepsilon^2d \right) \end{align}$$ Now, let $S_{ij}$ be the event that $(1-\varepsilon) \lvert \lvert x \rvert \rvert ^2 \leq\,\lvert \lvert \hat{G}x \rvert \rvert^2 \leq(1+\varepsilon)\lvert \lvert x \rvert \rvert ^2$ for $x=y_{i}-y_{j}$. Then by the [[Concept Wiki/outer measure has countable subadditivity\|union bound]], we have $$\begin{align} \mathbb{P}[S_{ij}^C \text{ for some }i,j] & \leq {n \choose 2} \cdot 2 \exp\left( \frac{1}{8} \varepsilon^2 d \right) \\ &\leq n^2 \exp\left( -\frac{1}{8} \varepsilon^2d \right) \\ &=\frac{1}{n} \exp\left( 3\log n-\frac{1}{8}\varepsilon^2\underbrace{ d }_{ d \geq 24 \frac{\log n}{\varepsilon^2}} \right) \\ \implies\quad &\leq \frac{1}{n} \exp\left( 3 \log n - 3\log n \right) \\ &=\frac{1}{n} \end{align}$$ $$\tag*{$\blacksquare$}$$

Note

Johnson-Lindenstrauss only deals with the pairwise distances between the and there are other aspects of the geometry that it does not address.

Despite this, it is an intuitive result and is still relevant to numerous applications.

Note

The proof of this lemma uses the first moment method

Extensions

It is easy to see that with the same probability bound by adding to the collection of and increasing to .

We can also write an expression for the bound on such that if then we can get a success probability of .

Johnson-Lindenstrauss Transform

Johnson-Lindenstrauss Transform

Multiplying data by a random matrix that satisfies the criteria for the Johnson-Lindenstrauss lemma is sometimes called the Johnson-Lindnestrauss transform (JLT).

This is usually done for the purpose of dimensionality reduction.

There is also work being done to see how to speed up the multiplications . The main approach to this is to find special matrices that allow for faster matrix-vector multiplication.

  • Simplest results deal with sparse matrices
  • Also fast Fourier Transform via multiplication with the discrete Fourier transform matrix
    • multiply then subsample entries

Optimality

The result is not optimal. The original paper showed that dimension is required independent of .

The following two papers show that the general result is tight up to constants:

  • Kasper Green Larsen and Jelani Nelson. The johnson-lindenstrauss lemma is optimal for linear dimensionality reduction. arXiv preprint arXiv:1411.2404, 2014.
  • Kasper Green Larsen and Jelani Nelson. Optimality of the johnson-lindenstrauss lemma. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 633–638. IEEE, 2017.

References

References

See Also

Mentions

Mentions

TABLE mod as "Last Modified"
FROM [[]]
 
FLATTEN choice(contains(artist, this.file.link) OR contains(author, this.file.link) OR contains(director, this.file.link) OR contains(source, this.file.link), 1, "") AS direct_source
 
FLATTEN contains(type, "thought") AS is_thought
FLATTEN contains(tags, "dailynote") AS is_periodic
 
FLATTEN choice(length(modified) > 0, max(dateformat(modified, "yyyy-MM-dd")), file.mday) AS mod
 
WHERE !direct_source
    AND !is_thought
    AND !is_periodic
    AND file != this.file
 
SORT mod DESC
SORT file.name ASC
const { dateTime } = await cJS()
 
return function View() {
	const file = dc.useCurrentFile();
	return <p class="dv-modified">Created {dateTime.getCreated(file)}     ֍     Last Modified {dateTime.getLastMod(file)}</p>
}