# The Johnson-Lindenstrauss lemma is optimal for linear dimensionality reduction

EADS talk by Kasper Green Larsen, Aarhus University

Abstract: The Johnson-Lindenstrauss lemma, dating back to 1984, says that for any set P of n points in d-dimensional Euclidean space, there exists a map f : P -> R^m, with m = O(eps^{-2} lg n), such that all pairwise distances between points in P are preserved to within a factor of (1+eps) when mapped to m-dimensional Euclidean space using f, i.e., for all p,q in P, it holds that (|p-q|_2)/(1+eps) < |f(p)-f(q)|_2 < (|p-q|_2)(1+eps). Observe that m does not depend on d, and hence the transformation works even for very high-dimensional points (d >> n). Furthermore, all known proofs of the JL-lemma in fact provide a mapping f which is linear (f can be represented by an m x d matrix A, and a point p is mapped by computing f(p) = Ap).
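As a minimal numerical sketch (not code from the talk), the classic construction realizing the guarantee above is a random Gaussian matrix scaled by 1/sqrt(m); the parameters n, d, eps and the constant 8 below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 200, 10_000, 0.5
m = int(np.ceil(8 * np.log(n) / eps**2))      # m = O(eps^-2 lg n), independent of d

P = rng.standard_normal((n, d))               # n points in R^d
A = rng.standard_normal((m, d)) / np.sqrt(m)  # the linear map f(p) = Ap
Q = P @ A.T                                   # images of the points in R^m

def pdist(X):
    # All pairwise Euclidean distances (upper triangle only),
    # via ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>.
    iu = np.triu_indices(len(X), k=1)
    sq = (X**2).sum(axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.sqrt(np.maximum(D2[iu], 0.0))

# Per-pair distortion: should lie in (1/(1+eps), 1+eps) for almost all pairs.
ratio = pdist(Q) / pdist(P)
frac_ok = np.mean((ratio > 1 / (1 + eps)) & (ratio < 1 + eps))
```

Note that each pair satisfies the distortion bound only with high probability, so a single sampled matrix preserves the vast majority of the n(n-1)/2 pairwise distances; the JL lemma asserts that a map preserving all of them exists at this value of m.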

The Johnson-Lindenstrauss lemma has found numerous applications in computer science, signal processing (e.g. compressed sensing), statistics and mathematics. The main idea in algorithmic applications is that one can transform a high-dimensional problem into a low-dimensional one such that an optimal solution to the low-dimensional problem can be lifted to a near-optimal solution to the original problem. Due to the decreased dimension, the lower dimensional problem requires fewer resources (time, memory etc.) to solve.

Given the widespread use of dimensionality reduction across several domains, it is a natural and often-asked question whether the JL-lemma is tight: Does there exist some set of n points P in R^d such that any map f : P -> R^m providing the JL-guarantee must have m = Omega(eps^{-2} lg n)? In the original paper of Johnson and Lindenstrauss (1984), the authors proved that there exists a set P requiring m = Omega(lg n). Much later, Alon (2003) proved that there exists a set of n points, P, requiring m = Omega(eps^{-2} lg n / lg(1/eps)), matching the upper bound up to the factor of lg(1/eps). This remained the strongest lower bound prior to our work.

30 years after Johnson and Lindenstrauss' seminal paper, we prove a tight lower bound of m = Omega(eps^{-2} lg n). The lower bound holds for any mapping f which is linear, but as mentioned, all known proofs of the JL-lemma provide a mapping f which is in fact linear. Thus one can interpret our lower bound in two ways: either as strong evidence that the JL-lemma is optimal, or as saying that any further progress must come from designing mappings that are non-linear. This seems to require a completely new approach to dimensionality reduction.

Joint work with Jelani Nelson, Harvard University.

Kasper Green Larsen is an Assistant Professor (Ph.D.) at MADALGO (Centre for Massive Data Algorithmics), Department of Computer Science, Aarhus University.