DeLTA seminar by Suraj Srinivas: On The (Missing) Foundations of Interpretable Machine Learning
Note that this is a virtual talk. However, you are welcome to join us in room UP1-1-1-N116A (Universitetsparken 1, 2100 København Ø) if you would like to watch the talk together with others.
Speaker
Suraj Srinivas, postdoctoral research fellow at Harvard University.
Title
On The (Missing) Foundations of Interpretable Machine Learning
Abstract
Despite the recent advances in the art of building large-scale machine learning models, the science underlying their remarkable behaviour is still largely in its infancy, contributing to the “black box” nature of these models. Interpretable machine learning aims to address this by building tools to understand and explain model behaviour. In this talk, I will present our recent work on the foundations of interpretable machine learning, highlighting fundamental conceptual bottlenecks. In particular, I will discuss the following issues: (1) the difficulty of formalizing interpretability, which obscures its conceptual goals; (2) the lack of well-defined evaluation metrics and, in particular, of a “ground truth” for interpretability, which makes progress difficult to measure; and (3) the difficulty of distinguishing between plausible explanations (which aim to convince humans of model correctness) and faithful explanations (which aim to reflect model behaviour accurately). I will present our recent research aimed at addressing these challenges and highlight open problems, particularly in the context of recent advances in large language models.
_____________________________
You can subscribe to the DeLTA Seminar mailing list by sending an empty email to delta-seminar-join@list.ku.dk.
DeLTA is a research group affiliated with the Department of Computer Science at the University of Copenhagen. It studies diverse aspects of Machine Learning Theory and its applications, including, but not limited to, Reinforcement Learning, Online Learning and Bandits, and PAC-Bayesian analysis.