Computer Science Lecture Series

Understanding Language Models through Discovery and by Design

John Hewitt (Stanford University)

Thursday, Feb 29, 2024

Whereas we understand technologies like airplanes or microprocessors well enough to fix them when they break, our tools for fixing modern language models are coarse. This is because, despite language models' increasing ubiquity and utility, we understand little about how they work. In this talk, I will present two lines of research for developing a deep, actionable understanding of language models that allows us to discover how they work and fix them when they fail. In the first line, I will present structural probing methods for discovering the learned structure of language models, finding evidence that models learn structure like linguistic syntax. In the second line, I will show how we can understand complex models by design: through the new Backpack neural architecture, which gives us precise tools for fixing models.

Speaker Bio

John Hewitt is a PhD student in Computer Science at Stanford University, working with Percy Liang and Christopher Manning on discovering the learned structure of neural language models, and designing them to be more understandable, diagnosable, and fixable. He was an NSF Graduate Fellow, and received a B.S.E. in Computer and Information Science from the University of Pennsylvania. John has received an Outstanding Paper Award at ACL 2023, a Best Paper Runner-Up at EMNLP 2019, an Honorable Mention for Best Paper at the Robustness of Few-Shot Learning in Foundation Models Workshop (R0-FoMo)@NeurIPS 2023, and an Outstanding Paper Award at the Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP)@EMNLP 2020.


Sham Kakade and Milind Tambe


Ester Ramirez