Kunhe Yang

University of California, Berkeley

Position: Ph.D. Candidate
Rising Stars year of participation: 2025
Bio

Kunhe Yang is a Ph.D. candidate in the EECS department at UC Berkeley, where she is advised by Nika Haghtalab. She is broadly interested in the intersection of economics and computer science. She has been working on designing and evaluating AI systems in strategic and agentic environments, drawing on tools from machine learning, game theory, and economics.

Areas of Research
  • Economics and Computation
The Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?

After pre-training, large language models are aligned with human preferences based on pairwise comparisons. State-of-the-art alignment methods (such as PPO-based RLHF and DPO) are built on the assumption of aligning with a single preference model, despite being deployed in settings where users have diverse preferences. As a result, it is not even clear that these alignment methods produce models that satisfy users on average, a minimal requirement for pluralistic alignment. Drawing on social choice theory and modeling users' comparisons through individual Bradley-Terry (BT) models, we introduce an alignment method's distortion: the worst-case ratio between the optimal achievable average utility and the average utility of the learned policy.
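
In symbols (a minimal sketch of the setup described above; the notation is ours and may differ from the paper's exact definitions): each user $i$ holds a utility function $u_i$ over responses and compares a pair $(y, y')$ through an individual BT model with temperature $\beta$,

$$\Pr[\, y \succ_i y' \,] = \frac{e^{\beta u_i(y)}}{e^{\beta u_i(y)} + e^{\beta u_i(y')}},$$

and an alignment method's distortion is the worst case, over instances (utility profiles and comparison distributions), of

$$\frac{\max_{\pi^*} \; \mathbb{E}_{i,\, y \sim \pi^*}[\, u_i(y) \,]}{\mathbb{E}_{i,\, y \sim \pi}[\, u_i(y) \,]},$$

where $\pi$ is the policy the method learns from the comparison data.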

The notion of distortion helps draw sharp distinctions between alignment methods: Nash Learning from Human Feedback achieves the minimax-optimal distortion of $(1/2 + o(1))\beta$ (for the BT temperature $\beta$), robustly across utility distributions, distributions of comparison pairs, and permissible KL divergences from the reference policy. RLHF and DPO, by contrast, suffer $\geq (1 - o(1))\beta$ distortion already without a KL constraint, and $e^{\Omega(\beta)}$ or even unbounded distortion in the full setting, depending on how comparison pairs are sampled.