DeLTA seminar by Suraj Srinavas: On The (Missing) Foundations of Interpretable Machine Learning

Note that this is a virtual talk. However, you are welcome to join us in room UP1-1-1-N116A (Universitetsparken 1, 2100 København Ø) if you would like to watch the talk together with others.


Suraj Srinavas, postdoctoral research fellow at Harvard University.


On The (Missing) Foundations of Interpretable Machine Learning


Despite recent advances in the art of building large-scale machine learning models, the science underlying their remarkable behaviour is still largely in its infancy, contributing to the “black box” nature of these models. Interpretable machine learning aims to address this by building tools to understand and explain model behaviour. In this talk, I will present our recent work on the foundations of interpretable machine learning, highlighting fundamental conceptual bottlenecks. In particular, I will discuss the following issues: (1) the difficulty of formalizing interpretability, which obscures its conceptual goals; (2) the lack of well-defined evaluation metrics, and in particular the lack of a “ground truth” for interpretability, which makes progress difficult to measure; and (3) the difficulty of distinguishing between plausible explanations (that aim to convince humans of model correctness) and faithful explanations (that aim to reflect model behaviour accurately). I will present our recent research aimed at addressing these challenges, and also highlight open problems, particularly in the context of recent advances in large language models.


You can subscribe to the DeLTA Seminar mailing list by sending an empty email to
Online calendar
DeLTA Lab page