DeLTA seminar by Emmanuel Esposito

Speaker

Emmanuel Esposito from the University of Milan.

Title

Improved Delay-Robustness in Online Learning and Bandits.

Abstract

Sequential decision-makers, from recommender systems to adaptive experiments, rarely receive feedback on time: clicks may happen hours later, gradients arrive after network lag, and some outcomes stay missing altogether. Over $T$ rounds, a per-round delay of $d$ can inflate the regret from the ideal $\sqrt{T}$ to $\sqrt{dT}$, degrading overall performance. In this talk, we illustrate how to make online learning algorithms more robust to delays under natural structural assumptions. First, we show how to achieve fast rates in online convex optimization with curved losses, with a dependence on the maximum number of missing observations and optimal dependencies on all other parameters. Second, in the classical multi-armed bandit problem, we demonstrate how, in the presence of intermediate signals, the dependence on feedback delays can be improved to a quantity that captures the richness of those signals.

Bio

I am a postdoctoral researcher at LAILA (University of Milan), hosted by Nicolò Cesa-Bianchi. Previously, I obtained my PhD in Computer Science at the University of Milan and the Italian Institute of Technology, where I was fortunate to be supervised by Nicolò Cesa-Bianchi and Massimiliano Pontil.
My research interests broadly lie in Online Learning and Machine Learning Theory. One of the key focuses of my research is to understand the interplay between feedback models and the hardness of sequential decision-making problems. I am also generally interested in their intersection with other areas of machine learning, such as Reinforcement Learning, Game Theory, and Optimization.

Join the DeLTA community

You can subscribe to the DeLTA Seminar mailing list by sending an empty email to delta-seminar-join@list.ku.dk.
DeLTA online calendar
DeLTA Lab page