Jantina Tammes School of Digital Society, Technology and AI
Digital prosperity for all

Language and AI Colloquium I: 'Building and Evaluating Language Models: From Data to Benchmarks' by Bram Vanroy

When: Fri 11-10-2024, 14:00 - 15:00
Where: Collaboratory A, Harmonie Building 1313.0125

This initiative aims to bring together everyone interested in or working at the intersection of natural language and artificial intelligence.

The Language and AI colloquia will take place from October 2024 to May 2025. Each colloquium takes place on a Friday, starting at 14:00 and ending at 15:00. For those interested, 1-on-1 meetings with the speakers can be arranged.

Bram Vanroy

The speaker for the first Language and AI Colloquium is Bram Vanroy from KU Leuven, with the talk 'Building and Evaluating Language Models: From Data to Benchmarks'.

'Large language models' (LLMs) and 'AI' are the buzzwords of the day, so much so that even "transformer" has made its way into the odd casual conversation. But what actually goes into these models? And how do we know if one model is truly better than the last?

This talk focuses on what happens before and after the training phase of LLMs: the craft of creating reliable datasets and assessing a model’s performance. We'll dive into the data pipelines responsible for creating high-quality datasets for the key stages of model development: pretraining (next-word prediction), supervised finetuning (chat/instruction), and preference tuning (alignment). We will discuss techniques such as web crawling, quality filtering, and synthetic data generation and scoring. Such data processing is gaining more and more attention; after all, if we put garbage into a model, it will spit it back out. You’ll also learn how model performance is evaluated across a variety of benchmarks, from straightforward question-answering to assessments of "emotional intelligence" and crowd-sourced user evaluation.
