Estimating the Effective Support Size in Constant Query Complexity
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Estimating the Effective Support Size in Constant Query Complexity. / Narayanan, Shyam; Tětek, Jakub.
Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA). ed. / Telikepalli Kavitha; Kurt Mehlhorn. Society for Industrial and Applied Mathematics, 2023. p. 242-252.
RIS
TY - GEN
T1 - Estimating the Effective Support Size in Constant Query Complexity
AU - Narayanan, Shyam
AU - Tětek, Jakub
PY - 2023
Y1 - 2023
N2 - Estimating the support size of a distribution is a well-studied problem in statistics. Motivated by the fact that this problem is highly non-robust (as small perturbations in the distribution can drastically affect the support size) and thus hard to estimate, Goldreich [ECCC 2019] studied the query complexity of estimating the ε-effective support size Ess_ε of a distribution P, which is equal to the smallest support size of a distribution that is within total variation distance ε of P. In his paper, he shows an algorithm in the dual access setting (where we may both receive random samples and query the sampling probability p(x) for any x) for a bicriteria approximation, giving an answer in [Ess_{(1+β)ε}, (1 + γ) Ess_ε] for some values β, γ > 0. However, his algorithm has either super-constant query complexity in the support size or super-constant approximation ratio 1 + γ = ω(1). He then asked if this is necessary, or if it is possible to get a constant-factor approximation in a number of queries independent of the support size. We answer his question by showing that not only is complexity independent of the support size n possible for γ > 0, but also for γ = 0, that is, that the bicriteria relaxation is not necessary. Specifically, we show an algorithm with query complexity . That is, for any 0 < ε, β < 1, we output in this complexity a number ñ ∈ [Ess_{(1+β)ε}, Ess_ε]. We also show that it is possible to solve the approximate version with approximation ratio 1 + γ in complexity .
AB - Estimating the support size of a distribution is a well-studied problem in statistics. Motivated by the fact that this problem is highly non-robust (as small perturbations in the distribution can drastically affect the support size) and thus hard to estimate, Goldreich [ECCC 2019] studied the query complexity of estimating the ε-effective support size Ess_ε of a distribution P, which is equal to the smallest support size of a distribution that is within total variation distance ε of P. In his paper, he shows an algorithm in the dual access setting (where we may both receive random samples and query the sampling probability p(x) for any x) for a bicriteria approximation, giving an answer in [Ess_{(1+β)ε}, (1 + γ) Ess_ε] for some values β, γ > 0. However, his algorithm has either super-constant query complexity in the support size or super-constant approximation ratio 1 + γ = ω(1). He then asked if this is necessary, or if it is possible to get a constant-factor approximation in a number of queries independent of the support size. We answer his question by showing that not only is complexity independent of the support size n possible for γ > 0, but also for γ = 0, that is, that the bicriteria relaxation is not necessary. Specifically, we show an algorithm with query complexity . That is, for any 0 < ε, β < 1, we output in this complexity a number ñ ∈ [Ess_{(1+β)ε}, Ess_ε]. We also show that it is possible to solve the approximate version with approximation ratio 1 + γ in complexity .
U2 - 10.1137/1.9781611977585.ch22
DO - 10.1137/1.9781611977585.ch22
M3 - Article in proceedings
SP - 242
EP - 252
BT - Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA)
A2 - Kavitha, Telikepalli
A2 - Mehlhorn, Kurt
PB - Society for Industrial and Applied Mathematics
T2 - 2023 Symposium on Simplicity in Algorithms (SOSA)
Y2 - 23 January 2023 through 25 January 2023
ER -
ID: 382689139
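The ε-effective support size described in the abstract can be illustrated with a small brute-force computation. The sketch below is not the paper's constant-query algorithm: it assumes the entire probability vector is known and reads all of it. It relies on the fact that the closest distribution supported on k elements keeps the k heaviest elements of P, so Ess_ε is the smallest k whose k heaviest elements carry mass at least 1 - ε. The function name `effective_support_size` and the floating-point tolerance `tol` are illustrative choices, not taken from the paper.

```python
from typing import Sequence


def effective_support_size(probs: Sequence[float], eps: float, tol: float = 1e-12) -> int:
    """Brute-force eps-effective support size of a fully known distribution.

    Returns the smallest k such that the k heaviest elements of `probs`
    have total mass at least 1 - eps. `tol` (an assumption of this sketch)
    absorbs floating-point rounding in the running sum.
    """
    if not 0 <= eps < 1:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    mass = 0.0
    # Visit elements from heaviest to lightest, accumulating their mass.
    for k, p in enumerate(sorted(probs, reverse=True), start=1):
        mass += p
        if mass >= 1 - eps - tol:
            return k
    # Reached only if rounding keeps the total slightly below 1 - eps.
    return len(probs)


# One heavy element plus 1000 very light ones: the plain support size is
# 1001, but the 0.01-effective support size is 1.
spiky = [0.99] + [1e-5] * 1000
print(len(spiky), effective_support_size(spiky, 0.01))
```

The example shows the non-robustness the abstract mentions: the 1000 light elements account for only 1% of the mass, yet they dominate the plain support size, while the effective support size ignores them.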