Maria Brbic

Stanford University

Position: Postdoctoral Researcher
Rising Stars year of participation: 2021
Bio

Maria Brbic is a postdoctoral researcher in computer science at Stanford University, working with Prof. Jure Leskovec. Her research focuses on developing new machine learning methods for solving challenging problems in biology and biomedicine. She is particularly interested in methods that can generalize to new contexts and tasks given limited amounts of labeled data, and in applications to single-cell genomics. She is engaged in projects at the Chan Zuckerberg Biohub and the Stanford Neuro-omics Initiative. She received her PhD in 2019 from the University of Zagreb, Croatia. During her PhD, she was a visiting student at the University of Tokyo (2018) and Stanford University (2018/2019). She received a bachelor's degree and a master's degree in computer science from the University of Zagreb. She was awarded the Fulbright Scholarship, the L’Oreal UNESCO for Women in Science Scholarship, the Branimir Jernej award for outstanding publication in biology and biomedicine, and the Silver Plaque Josip Loncar for best doctoral dissertation.

Bridging labeled and unlabeled data in biomedicine

Machine learning methods have reached human-level performance on tasks with an abundance of large-scale labeled training data that can support learning of highly parameterized models. However, in biomedical applications, labeled datasets are difficult and costly to obtain due to the tedious manual effort required for annotating datasets and our still limited knowledge of complex underlying biological mechanisms. We are therefore faced with scarcely labeled or even completely unlabeled datasets. To enable new biomedical discoveries, we need algorithms that can generalize to novel, never-before-seen classes derived from new biological contexts, such as measurements in a new tissue, species, or disease state. We propose novel interpretable deep neural networks that bridge labeled and unlabeled data by learning to generalize across tasks given only a few labeled examples or, in the extreme case, without any labeled data. Our methods are grounded in meta-learning, a machine learning paradigm that supports learning of task-invariant latent representations by reusing labels of related tasks as auxiliary data. We apply our methods to discover never-before-seen cell types across heterogeneous single-cell experiments. We show the unique ability of our approach to transfer existing cell-type annotations to previously uncharacterized cell types and poorly annotated experiments across different tissues, species, and sequencing technologies. Our work demonstrates that cross-dataset transfer is not only possible but is a necessary component in developing next-generation algorithms that can discover new biological insights and solve central problems in biology.