Yating Wu
The University of Texas at Austin
yating.wu@utexas.edu
Bio
Yixin Wu is a final-year Ph.D. student at the CISPA Helmholtz Center for Information Security, advised by Michael Backes and Yang Zhang. Her research focuses on trustworthy AI, particularly exploring vulnerabilities in AI systems, and on the responsible use of AI, especially tracing and detecting downstream use and misuse. Her work has been published in top-tier venues including USENIX Security, CCS, PoPETs, and EMNLP. She was also selected as a 2025 ML and Systems Rising Star and as a Young Researcher at the Heidelberg Laureate Forum.
Areas of Research
- Natural Language and Speech Processing
Question-Based Representations for Reliable and Adaptable Language Models
Emerging AI techniques continue to advance at an unprecedented pace, bringing transformative capabilities that are rapidly reshaping how we create, interact, and decide. Yet alongside this rapid progress come growing concerns, ranging from unsafe, biased, and inaccurate content to the unintentional leakage of sensitive data. These issues are fundamentally data-driven, rooted in every stage of the AI lifecycle: from data curation and the training supply chain, to the machine learning algorithms themselves, and finally to downstream applications. My research approaches emerging risks at their origins, aiming to understand how they arise, to develop frameworks that measure them, and to build novel solutions. First, I assess whether the safety and bias issues in text-to-image models, which largely stem from flawed data curation, have been effectively mitigated as the models evolve, despite developers' claims of improvement. Our findings reveal that both issues remain under-addressed, and one has worsened over time. Second, I demonstrate that security vulnerabilities in the external training supply chain can be exploited to introduce new safety issues: text-to-image models can be manipulated to generate unsafe images from benign prompts. Next, I conduct a systematic measurement of the privacy risks arising from training data memorization in visual prompt learning. Although the learned prompt is orders of magnitude smaller than the full model and heavily compresses the training data, I show that it can still memorize and leak sensitive information, posing privacy risks. Finally, I turn to the deployment stage of the AI lifecycle, where the reuse of LLM-generated synthetic data raises growing concerns about societal bias and hallucinated content. To address this, I propose the first auditing framework for detecting whether synthetic data has been used in downstream applications, such as model development, thereby providing transparency into its reuse pathways within the AI lifecycle.
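The abstract does not spell out how the auditing framework works, so the snippet below is only a minimal illustrative sketch of the general idea behind detecting data reuse, not the proposed method: a generic membership-inference-style check that asks whether a suspect downstream model assigns systematically lower loss to candidate synthetic records than to comparable held-out ones. The model name (`gpt2` as a stand-in), the NLL scoring, the example texts, and the one-sided Mann-Whitney test are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic membership-inference-style audit for
# whether particular synthetic records were used to train a downstream
# language model. This is NOT the framework described in the abstract.
import torch
from scipy.stats import mannwhitneyu
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # hypothetical stand-in for the suspect downstream model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def per_example_nll(texts):
    """Average token-level negative log-likelihood of each text under the model."""
    scores = []
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            out = model(**inputs, labels=inputs["input_ids"])
        scores.append(out.loss.item())
    return scores

# candidate_records: synthetic texts suspected of having been reused in training.
# reference_records: comparable synthetic texts known not to have been used.
candidate_records = ["Example synthetic sentence one.", "Example synthetic sentence two."]
reference_records = ["A comparable held-out synthetic sentence.", "Another held-out sentence."]

candidate_nll = per_example_nll(candidate_records)
reference_nll = per_example_nll(reference_records)

# Systematically lower loss on the candidate records is evidence (not proof)
# that they were seen during training.
stat, p_value = mannwhitneyu(candidate_nll, reference_nll, alternative="less")
print(f"one-sided Mann-Whitney U p-value: {p_value:.3f}")
```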