Emily Diana
University of Pennsylvania
ediana@wharton.upenn.edu
Bio
Emily Diana is a fourth-year doctoral student in Statistics at the Wharton School, University of Pennsylvania. She is advised by Michael Kearns and Aaron Roth and researches ethical concerns in statistical learning theory, particularly fairness in machine learning and private data analysis. Emily holds a B.A. in Applied Mathematics from Yale College and an M.S. in Statistics from Stanford University. Before graduate school, she worked for two years as a scientific software developer at Lawrence Livermore National Laboratory, and she currently maintains a research collaboration with Amazon Web Services. She is a Wharton Doctoral Programs peer mentor and serves as a student representative on the school’s PhD Executive Committee.
Multiaccurate Proxies for Downstream Fairness
We study the problem of training a model that must obey demographic fairness conditions when the sensitive features are not available at training time — in other words, how can we train a model to be fair by race when we don’t have data about race? We adopt a fairness pipeline perspective, in which an “upstream” learner that does have access to the sensitive features learns a proxy model for these features from the other attributes. The goal of the proxy is to allow a general “downstream” learner — with minimal assumptions on their prediction task — to use the proxy to train a model that is fair with respect to the true sensitive features. We show that obeying multiaccuracy constraints with respect to the downstream model class suffices for this purpose, and we provide sample- and oracle-efficient algorithms and generalization bounds for learning such proxies. In general, multiaccuracy can be much easier to satisfy than classification accuracy and can be satisfied even when the sensitive features are hard to predict.
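To make the multiaccuracy condition concrete, the following is a minimal, self-contained sketch on synthetic data — not the paper's algorithm. It treats a proxy for a sensitive attribute z as multiaccurate with respect to an audit class F when E[f(x)·(proxy(x) − z(x))] ≈ 0 for every f in F, and greedily corrects the proxy against the worst-violating audit function. The names `audit_fns` and `max_violation`, the toy linear audit class, and the data-generating process are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: features X, and a sensitive attribute z that is
# correlated with the features but unavailable downstream.
n, d = 2000, 5
X = rng.normal(size=(n, d))
z = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(float)

# A hypothetical audit class: the d linear functions f_j(x) = x_j.
audit_fns = [lambda X, j=j: X[:, j] for j in range(d)]

def max_violation(proxy_vals):
    """Largest |E[f(x) * (proxy(x) - z(x))]| over the audit class."""
    resid = proxy_vals - z
    return max(abs(np.mean(f(X) * resid)) for f in audit_fns)

# Start from a trivial proxy (the base rate) and repeatedly remove the
# residual's correlation with the worst-violating audit function.
proxy = np.full(n, z.mean())
for _ in range(200):
    resid = proxy - z
    corrs = [np.mean(f(X) * resid) for f in audit_fns]
    j = int(np.argmax(np.abs(corrs)))
    if abs(corrs[j]) < 1e-4:
        break  # multiaccurate up to tolerance
    fj = audit_fns[j](X)
    # Least-squares step zeroing out the empirical correlation with f_j.
    proxy = proxy - corrs[j] / np.mean(fj ** 2) * fj

print(round(max_violation(proxy), 4))
```

Note that the loop never needs to predict z well pointwise: it only drives a handful of correlations to zero, which is why multiaccuracy can hold even when z itself is hard to classify.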