Talk by Wouter M. Koolen


MetaGrad: Faster Convergence Without Curvature in Online Convex Optimization


Online convex optimization is a machine learning paradigm that is popular in learning from large data sets. We develop a new style of algorithms for online convex optimization that adapt to the complexity of the functions encountered by learning the learning rate. As a result, we get refined guarantees that express that we improve the sqrt(T) worst-case regret bound whenever the sought optimum is stable. It is well-known that reduced regret is possible with functions that exhibit curvature. Yet our stability is a much weaker requirement; it includes many cases without any curvature. For example any single fixed function, or functions generated by stochastic sources that satisfy a natural condition. Our new algorithms enjoy the efficiency of online gradient descent and AdaGrad. We expect the new adaptivity to be especially useful in practice.

The talk will be self-contained; he will give plenty of motivation, background and examples.

About Wouter M. Koolen

Wouter M. KoolenWouter M. Koolen is a machine learning researcher, specialising in the foundations of learning and sequential decision making. He has a background in computer science and artificial intelligence and he's a Master of Logic.

He works at the Centrum Wiskunde & Informatica in Amsterdam with Peter D. Grünwald. His research project Learning at the Intrinsic Task Pace is funded by a NWO VENI grant. He previously held a Vice-Chancellor's postdoctoral research fellowship at the Queensland University of Technology with  Peter L. Bartlett.

Wouter M. Koolen is interested in designing learning algorithms that progressively improve their decisions, for example by combining the predictions of multiple experts, models, systems, etc. This research area, Online Learning, connects subjects ranging from computer science, statistics, information theory and optimisation to economics. He's also intrigued by the mathematical foundation of ideal learning built atop Kolmogorov Complexity.