PhD defence by Yi-Shan Wu
Second-Order Concentration Inequalities with Application to the Weighted Majority Vote
The weighted majority vote is a component of the winning strategies in many machine learning competitions. It is an integral part of random forests and boosting, and it is also used to combine the predictions of heterogeneous classifiers. While generalization guarantees and theoretically grounded optimization algorithms for weighted majority votes have been a research topic for decades, it remains unclear how to achieve a generalization bound that is both tight and serves as a faithful optimization objective, so that minimizing the bound yields good test performance of the resulting weighted majority vote.
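For readers unfamiliar with the setting, a weighted majority vote can be sketched in a few lines of Python. The classifiers, weights, and data below are illustrative placeholders, not the ensembles studied in the thesis:

```python
import numpy as np

def majority_vote(predictions, weights):
    """Combine {-1, +1} predictions of several classifiers.

    predictions: array of shape (n_classifiers, n_samples)
    weights:     nonnegative weights (the distribution rho over classifiers)
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()            # normalize to a distribution
    scores = weights @ np.asarray(predictions)   # rho-weighted average vote
    return np.sign(scores)                       # majority decision per sample

# Three toy classifiers voting on four points.
preds = np.array([
    [+1, +1, -1, -1],
    [+1, -1, -1, +1],
    [-1, +1, -1, -1],
])
rho = [0.5, 0.3, 0.2]
print(majority_vote(preds, rho))  # -> [ 1.  1. -1. -1.]
```

Each sample is classified by the sign of the $\rho$-weighted average of the individual votes; the ensemble can be correct even when some individual classifiers err.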
There have been extensive studies of second-order analysis, which takes into account the correlation of errors of the classifiers, the key source of the power of the weighted majority vote. However, the existing bounds are either tight but hard to optimize, e.g., the C-bounds, or easy to optimize but less tight, e.g., the tandem bound. In this thesis, we derive new second-order oracle bounds and show that it is possible to achieve bounds that are as tight as the C-bounds yet remain as easy to optimize as the tandem bound.
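The tandem bound mentioned above is built on the tandem loss: the probability that two classifiers err on the same sample simultaneously, which is what captures error correlations. A minimal sketch of estimating the tandem loss empirically, on toy data that is illustrative only:

```python
import numpy as np

def tandem_loss(pred_h, pred_hp, y):
    """Empirical tandem loss: fraction of samples where both h and h' err."""
    err_h = np.asarray(pred_h) != np.asarray(y)    # errors of classifier h
    err_hp = np.asarray(pred_hp) != np.asarray(y)  # errors of classifier h'
    return float(np.mean(err_h & err_hp))          # both err at once

y       = np.array([1, 1, 0, 0, 1])
pred_h  = np.array([1, 0, 0, 1, 1])   # errs on samples 1 and 3
pred_hp = np.array([1, 0, 1, 1, 1])   # errs on samples 1, 2 and 3
print(tandem_loss(pred_h, pred_hp, y))  # both err on samples 1 and 3 -> 0.4
```

Averaging this quantity over pairs of classifiers drawn from the ensemble distribution is what a second-order bound controls; the thesis studies how tightly this can be done while keeping the bound easy to optimize.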
Concentration-of-measure inequalities are needed to transform oracle bounds into empirical bounds. In particular, we derive a new second-order concentration inequality for ternary random variables, i.e., random variables whose support has size three. Ternary random variables appear in various applications, including the analysis of the weighted majority vote, excess losses, and learning with abstention. The new inequality combines the advantages of the $kl$ inequality, which exploits the combinatorial structure of binary random variables, and of Bernstein-type inequalities, which are effective when the random variables have low variance.
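For context, the classical $kl$ inequality for binary random variables can be stated as follows (a standard form from the literature, not quoted from the thesis): for $n$ i.i.d. Bernoulli random variables with bias $p$ and empirical mean $\hat p_n$,

```latex
\[
\mathbb{P}\bigl(\operatorname{kl}(\hat p_n \,\|\, p) \ge \varepsilon\bigr) \le (n+1)\, e^{-n\varepsilon},
\qquad
\operatorname{kl}(p \,\|\, q) = p \ln\frac{p}{q} + (1-p)\ln\frac{1-p}{1-q}.
\]
```

A canonical ternary example is the excess loss $Z = \mathbb{1}[h(X) \neq Y] - \mathbb{1}[h'(X) \neq Y] \in \{-1, 0, 1\}$, which compares the errors of two classifiers $h$ and $h'$ and is exactly the kind of random variable the new inequality targets.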
In this thesis, our focus is on ensemble classifiers and randomized classifiers. To convert oracle bounds into empirical bounds in this setting, we use PAC-Bayesian analysis, applying standard PAC-Bayesian techniques to obtain PAC-Bayesian versions of the various concentration inequalities.
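As an illustration of such a conversion, the well-known PAC-Bayes-$kl$ inequality (a standard result from the PAC-Bayesian literature, not quoted from the thesis) states that with probability at least $1-\delta$ over a sample of size $n$, simultaneously for all posterior distributions $\rho$ over classifiers,

```latex
\[
\operatorname{kl}\Bigl(\mathbb{E}_{\rho}\bigl[\hat L(h)\bigr] \,\Big\|\, \mathbb{E}_{\rho}\bigl[L(h)\bigr]\Bigr)
\le \frac{\operatorname{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{n},
\]
```

where $\hat L(h)$ and $L(h)$ are the empirical and expected losses of a classifier $h$, and $\pi$ is a prior distribution over classifiers chosen before seeing the data.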
Assessment Committee:
Professor Wouter Boomsma, Department of Computer Science, UCPH
Associate Professor Tim van Erven, University of Amsterdam, the Netherlands
Directeur de Recherche Olivier Catoni, CNRS, France
Leader of defence: Sadegh Talebi
Principal Supervisor Yevgeny Seldin
Co-Supervisor Anders Østerskov Søgaard
Meeting ID: 627 7477 7483
IT responsible: Chloé Rouyer
For an electronic copy of the thesis, please visit the PhD Programme page.