MIT Rising Stars

Naomi P Saphra

Harvard University

Position: Research Fellow

Rising Stars year of participation: 2024

Contact:

nsaphra@fas.harvard.edu

Bio

Naomi Saphra is a research fellow at the Kempner Institute at Harvard University. She is interested in NLP training dynamics: how models learn to encode linguistic patterns or other structure and how we can encode useful inductive biases into the training process. Previously, she earned a PhD from the University of Edinburgh on Training Dynamics of Neural Language Models; worked at NYU, Google, and Facebook; and attended Johns Hopkins and Carnegie Mellon University. Outside of research, she plays roller derby under the name Gaussian Retribution, performs stand-up comedy, and shepherds disabled programmers into the world of code dictation.

Areas of Research

Natural Language and Speech Processing

Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs

Most interpretability research in NLP focuses on understanding the behavior and features of a fully trained model. However, certain insights into model behavior may only be accessible by observing the trajectory of the training process. We present a case study of syntax acquisition in masked language models (MLMs) that demonstrates how analyzing the evolution of interpretable artifacts throughout training deepens our understanding of emergent behavior. In particular, we study Syntactic Attention Structure (SAS), a naturally emerging property of MLMs wherein specific Transformer heads tend to focus on specific syntactic relations. We identify a brief window in pretraining when models abruptly acquire SAS, concurrent with a steep drop in loss. This breakthrough precipitates the subsequent acquisition of linguistic capabilities. We then examine the causal role of SAS by manipulating SAS during training, and demonstrate that SAS is necessary for the development of grammatical capabilities. We further find that SAS competes with other beneficial traits during training, and that briefly suppressing SAS improves model quality. These findings offer an interpretation of a real-world example of both simplicity bias and breakthrough training dynamics.
