Estimating the Effective Support Size in Constant Query Complexity

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Standard

Estimating the Effective Support Size in Constant Query Complexity. / Narayanan, Shyam; Tětek, Jakub.

Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA). ed. / Telikepalli Kavitha; Kurt Mehlhorn. Society for Industrial and Applied Mathematics, 2023. p. 242-252.

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Harvard

Narayanan, S & Tětek, J 2023, Estimating the Effective Support Size in Constant Query Complexity. in T Kavitha & K Mehlhorn (eds), Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA). Society for Industrial and Applied Mathematics, pp. 242-252, 2023 Symposium on Simplicity in Algorithms (SOSA), Florence, Italy, 23/01/2023. https://doi.org/10.1137/1.9781611977585.ch22

APA

Narayanan, S., & Tětek, J. (2023). Estimating the Effective Support Size in Constant Query Complexity. In T. Kavitha, & K. Mehlhorn (Eds.), Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA) (pp. 242-252). Society for Industrial and Applied Mathematics. https://doi.org/10.1137/1.9781611977585.ch22

Vancouver

Narayanan S, Tětek J. Estimating the Effective Support Size in Constant Query Complexity. In Kavitha T, Mehlhorn K, editors, Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA). Society for Industrial and Applied Mathematics. 2023. p. 242-252 https://doi.org/10.1137/1.9781611977585.ch22

Author

Narayanan, Shyam ; Tětek, Jakub. / Estimating the Effective Support Size in Constant Query Complexity. Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA). editor / Telikepalli Kavitha ; Kurt Mehlhorn. Society for Industrial and Applied Mathematics, 2023. pp. 242-252

Bibtex

@inproceedings{834a3603f246447cb2105f800f103b3d,
title = "Estimating the Effective Support Size in Constant Query Complexity",
abstract = "Estimating the support size of a distribution is a well-studied problem in statistics. Motivated by the fact that this problem is highly non-robust (as small perturbations in the distributions can drastically affect the support size) and thus hard to estimate, Goldreich [ECCC 2019] studied the query complexity of estimating the ε-effective support size Essε of a distribution P, which is equal to the smallest support size of a distribution that is ε-far in total variation distance from P.In his paper, he shows an algorithm in the dual access setting (where we may both receive random samples and query the sampling probability p(x) for any x) for a bicriteria approximation, giving an answer in [Ess(1+β)ε, (1 + γ) Essε] for some values β, γ > 0. However, his algorithm has either super-constant query complexity in the support size or super-constant approximation ratio 1 + γ = ω(1). He then asked if this is necessary, or if it is possible to get a constant-factor approximation in a number of queries independent of the support size.We answer his question by showing that not only is complexity independent of n possible for γ > 0, but also for γ = 0, that is, that the bicriteria relaxation is not necessary. Specifically, we show an algorithm with query complexity . That is, for any 0 < ε, β < 1, we output in this complexity a number {\~n} ∊ [Ess(1+β)ε, Essε]. We also show that it is possible to solve the approximate version with approximation ratio 1 + γ in complexity .",
author = "Shyam Narayanan and Jakub T{\v e}tek",
year = "2023",
doi = "10.1137/1.9781611977585.ch22",
language = "English",
pages = "242--252",
editor = "Telikepalli Kavitha and Kurt Mehlhorn",
booktitle = "Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA)",
publisher = "Society for Industrial and Applied Mathematics",
address = "United States",
note = "2023 Symposium on Simplicity in Algorithms (SOSA) ; Conference date: 23-01-2023 Through 25-01-2023",

}

RIS

TY - GEN

T1 - Estimating the Effective Support Size in Constant Query Complexity

AU - Narayanan, Shyam

AU - Tětek, Jakub

PY - 2023

Y1 - 2023

N2 - Estimating the support size of a distribution is a well-studied problem in statistics. Motivated by the fact that this problem is highly non-robust (as small perturbations in the distributions can drastically affect the support size) and thus hard to estimate, Goldreich [ECCC 2019] studied the query complexity of estimating the ε-effective support size Essε of a distribution P, which is equal to the smallest support size of a distribution that is ε-far in total variation distance from P.In his paper, he shows an algorithm in the dual access setting (where we may both receive random samples and query the sampling probability p(x) for any x) for a bicriteria approximation, giving an answer in [Ess(1+β)ε, (1 + γ) Essε] for some values β, γ > 0. However, his algorithm has either super-constant query complexity in the support size or super-constant approximation ratio 1 + γ = ω(1). He then asked if this is necessary, or if it is possible to get a constant-factor approximation in a number of queries independent of the support size.We answer his question by showing that not only is complexity independent of n possible for γ > 0, but also for γ = 0, that is, that the bicriteria relaxation is not necessary. Specifically, we show an algorithm with query complexity . That is, for any 0 < ε, β < 1, we output in this complexity a number ñ ∊ [Ess(1+β)ε, Essε]. We also show that it is possible to solve the approximate version with approximation ratio 1 + γ in complexity .

AB - Estimating the support size of a distribution is a well-studied problem in statistics. Motivated by the fact that this problem is highly non-robust (as small perturbations in the distributions can drastically affect the support size) and thus hard to estimate, Goldreich [ECCC 2019] studied the query complexity of estimating the ε-effective support size Essε of a distribution P, which is equal to the smallest support size of a distribution that is ε-far in total variation distance from P.In his paper, he shows an algorithm in the dual access setting (where we may both receive random samples and query the sampling probability p(x) for any x) for a bicriteria approximation, giving an answer in [Ess(1+β)ε, (1 + γ) Essε] for some values β, γ > 0. However, his algorithm has either super-constant query complexity in the support size or super-constant approximation ratio 1 + γ = ω(1). He then asked if this is necessary, or if it is possible to get a constant-factor approximation in a number of queries independent of the support size.We answer his question by showing that not only is complexity independent of n possible for γ > 0, but also for γ = 0, that is, that the bicriteria relaxation is not necessary. Specifically, we show an algorithm with query complexity . That is, for any 0 < ε, β < 1, we output in this complexity a number ñ ∊ [Ess(1+β)ε, Essε]. We also show that it is possible to solve the approximate version with approximation ratio 1 + γ in complexity .

U2 - 10.1137/1.9781611977585.ch22

DO - 10.1137/1.9781611977585.ch22

M3 - Article in proceedings

SP - 242

EP - 252

BT - Proceedings, 2023 Symposium on Simplicity in Algorithms (SOSA)

A2 - Kavitha, Telikepalli

A2 - Mehlhorn, Kurt

PB - Society for Industrial and Applied Mathematics

T2 - 2023 Symposium on Simplicity in Algorithms (SOSA)

Y2 - 23 January 2023 through 25 January 2023

ER -

ID: 382689139