Siyuan Guo

Max Planck Institute for Intelligent Systems

Position: PhD Candidate
Rising Stars year of participation: 2025
Bio

Siyuan Guo is a Ph.D. candidate in Machine Learning at the University of Cambridge, UK and the Max Planck Institute for Intelligent Systems, Tuebingen, Germany (2021–2025), advised by Ferenc Huszar and Bernhard Schoelkopf. Her research focuses on understanding the foundations of learning and causality and on building machine learning models for in-context causal inference, with results in 7 NeurIPS and ICLR publications, including 1 oral and 2 spotlight presentations. She has also spent time as a Research Scientist Intern at Meta FAIR, New York, and has received funding from the G-Research Prize, Premium Research Studentships, Cambridge-Tuebingen Fellowships and the MPI-IS Outstanding Female Doctoral Award.

Areas of Research
  • Artificial Intelligence
Foundations for Learning and Causality

Machine learning has privileged trial-and-error engineering and scale-first heuristics, in part because we still lack a principled understanding of when and why learning emerges, generalizes, and fails. Siyuan believes that intelligence develops through the need to survive and that efficient learning is the key to intelligence. Her research studies the problem of building an efficient learning system. Efficient learning processes information in the least time, i.e., it reaches a desired error threshold with the fewest observations. Building upon least action principles from physics, her research derives classic learning algorithms, Bellman’s optimality equation in reinforcement learning, and the Adam optimizer in generative models from first principles, namely the Learning Lagrangian. This line of work, the physics of learning, shows preliminary evidence for, and postulates, that learning searches for stationary paths in the Lagrangian and that learning algorithms are derivable by seeking the stationary trajectories. Beyond understanding the fundamentals of learning and intelligence, her research also covers machine learning for causal inference, which involves pre-training a tabular model on synthetic data for automatic in-context causal inference. This has intriguing downstream applications in healthcare and enterprise settings.