
AILo Talk - Gabriele Sarti, University of Groningen

When: Tu 23-05-2023 16:00 - 18:00
Where: 5161.0222 Bernoulliborg

Title: Interpreting Language Models with Feature Attribution

Abstract:

In recent years, Transformer-based language models have achieved remarkable progress in most language generation and understanding tasks. However, the internal computations of these models are hard to interpret due to their highly nonlinear structure, which hinders their use in mission-critical applications that require trustworthiness and transparency guarantees. This presentation will introduce interpretability methods for tracing the predictions of language models back to their inputs and discuss how these methods can be used to gain insights into model biases and behaviors. Throughout the presentation, several concrete examples of language model attributions will be presented using the Inseq interpretability library.

Gabriele Sarti is a Ph.D. student in the Computational Linguistics Group (GroNLP) at the University of Groningen, Netherlands. Previously, he worked as a research intern at Amazon Translate NYC, a research scientist at Aindo, and a research assistant at the ItaliaNLP Lab (CNR-ILC, Pisa). His research aims to improve our understanding of the inner workings of generative neural language models, with the goal of enhancing the controllability and robustness of these systems for human-AI collaboration.