Mengzhou Xia
Princeton University
mengzhou@princeton.edu
Bio
Mengzhou Xia is a final-year PhD student at Princeton University, advised by Prof. Danqi Chen. Her research focuses on developing efficient and high-performing language models by designing innovative algorithms within the constraints of an academic budget, targeting both model and data scalability issues. She earned her master’s degree from Carnegie Mellon University, where she collaborated with Prof. Graham Neubig, and completed her bachelor’s degree at Fudan University in China. Mengzhou is a recipient of the 2024 Apple Scholars in AI/ML PhD Fellowship and the 2022 Bloomberg Data Science PhD Fellowship. During her PhD, she has gained industry experience through internships at Meta AI, Microsoft Research, and Bloomberg AI.
Areas of Research
- Natural Language and Speech Processing
Pre-training and Aligning Language Models: Algorithmic Advances in Objectives and Data Curation
My research focuses on designing algorithms to build high-performing, efficient, and aligned open foundation models, with emphasis on the following areas:
- Pre-training small-scale language models: To address the high cost of LLMs, I explore efficient methods for training competitive small models. Building on my work on CoFiPruning, we prune large models to targeted sizes using just 3% of the compute needed to pre-train from scratch. We released Sheared-LLaMA, a series of small models (160M, 1.3B, 2.7B parameters) that achieve state-of-the-art performance and receive over 10,000 monthly downloads, demonstrating the impact of academic research on LLM pre-training with limited resources.
- Alignment through data curation: Data quality is crucial for alignment. We developed LESS, an algorithm that identifies the training data most relevant to a specific capability (e.g., multilingual QA); a sketch of the underlying idea follows this statement. Our work shows that a small subset (5%) of the data can outperform the full dataset for instruction tuning, and that data selected by small models transfers effectively to larger ones. LESS also identifies data that appears benign but can compromise safety, demonstrating its broad applicability.
- Efficient alignment objectives: Alignment with RLHF is complex and expensive, while DPO introduces unnecessary regularization toward a reference model. We propose SimPO, a reference-free alignment objective that improves instruction-following performance; a loss sketch also appears below. Our model ranked 1st on AlpacaEval 2, Arena-Hard, and WildBench among similar-sized (7-8B) models at release.
- Additional work: I also study LLM training trajectories, evaluate the safety and transparency of large language models, design new architectures, and build challenging reasoning-intensive benchmarks.
While LLMs excel at individual tasks, they often struggle with novel scenarios. To enhance core planning and reasoning, I will explore large-scale data synthesis, autonomous learning, and human-AI interaction to help develop the next generation of open models.
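To make the data-curation idea concrete, below is a minimal sketch of gradient-similarity-based data selection in the spirit of LESS; the function name, tensor shapes, and the use of max cosine similarity are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def select_influential_examples(train_grads: torch.Tensor,
                                target_grads: torch.Tensor,
                                k: int) -> torch.Tensor:
    """Pick the k training examples whose (low-dimensional) gradient features
    best align with gradients from a small set of target-capability examples.

    train_grads:  (N, d) projected gradient features of candidate training data
    target_grads: (M, d) projected gradient features of target-task examples
    """
    train = F.normalize(train_grads, dim=-1)
    target = F.normalize(target_grads, dim=-1)
    # Score each candidate by its best cosine similarity to any target example.
    scores = (train @ target.T).max(dim=-1).values
    return scores.topk(k).indices
```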
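The reference-free objective can likewise be sketched with a SimPO-style preference loss over length-normalized log-probabilities; the variable names and the hyperparameter values (beta, gamma) here are illustrative assumptions.

```python
import torch.nn.functional as F

def simpo_loss(chosen_logps, rejected_logps,
               chosen_lengths, rejected_lengths,
               beta=2.0, gamma=0.5):
    """Reference-free preference loss using length-normalized log-probabilities."""
    # Implicit reward: the policy's average log-probability of a response, scaled by beta.
    chosen_reward = beta * chosen_logps / chosen_lengths
    rejected_reward = beta * rejected_logps / rejected_lengths
    # Push the chosen reward above the rejected one by at least a target margin gamma.
    return -F.logsigmoid(chosen_reward - rejected_reward - gamma).mean()
```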