DeLTA seminar by Suraj Srinivas: On The (Missing) Foundations of Interpretable Machine Learning

Note that this is a virtual talk. However, you are welcome to join us in room UP1-1-1-N116A (Universitetsparken 1, 2100 København Ø) if you would like to watch the talk together with others.

Speaker

Suraj Srinivas, postdoctoral research fellow at Harvard University.

Title

On The (Missing) Foundations of Interpretable Machine Learning

Abstract

Despite the recent advances in the art of building large-scale machine learning models, the science underlying their remarkable behaviour is largely in its infancy, contributing to the “black box” nature of these models. Interpretable machine learning aims to address this by building tools to understand and explain model behaviour. In this talk, I will present our recent work on the foundations of interpretable machine learning, highlighting fundamental conceptual bottlenecks. In particular, I will discuss the following issues: (1) the difficulty in formalizing interpretability, obscuring its conceptual goals; (2) the lack of well-defined evaluation metrics, and in particular, a lack of a “ground truth” for interpretability, making progress difficult to measure; and (3) the difficulty in distinguishing between plausible explanations (that aim to convince humans of model correctness) and faithful explanations (that aim to reflect model behaviour accurately). I will present our recent research aiming at addressing these challenges, and also highlight open problems, particularly in the context of recent advances in large language models.

_____________________________

You can subscribe to the DeLTA Seminar mailing list by sending an empty email to delta-seminar-join@list.ku.dk.

DeLTA is a research group affiliated with the Department of Computer Science at the University of Copenhagen. It studies diverse aspects of machine learning theory and its applications, including, but not limited to, reinforcement learning, online learning and bandits, and PAC-Bayesian analysis.
