Judy Hoffman
UC Berkeley
Bio
Judy Hoffman is a Ph.D. candidate in UC Berkeley's Computer Vision Group. She received her B.Sc. in Electrical Engineering and Computer Science from UC Berkeley in 2010. Her research lies at the intersection of computer vision, transfer learning, and machine learning: she is interested in minimizing the amount of human supervision needed to learn new visual recognition models. Judy was awarded the NSF Graduate Research Fellowship and the Rosalie M. Stern Fellowship in 2010. She was co-president of Women in Computer Science and Engineering at UC Berkeley in 2012-2013, served as outreach and diversity officer for the Computer Science Graduate Association in 2013-2014, and organized the first Women in Computer Vision workshop, held at CVPR 2015.
Adapting Deep Visual Models for Visual Recognition in the Wild
Understanding visual scenes is a crucial piece of many artificial intelligence applications, ranging from autonomous vehicles and household robotic navigation to automatic image captioning for the blind. Reliably extracting high-level semantic information from the visual world in real time is key to solving these critical tasks safely and correctly. Existing approaches based on specialized recognition models are prohibitively expensive or intractable due to the limits of dataset collection and annotation. Facilitating learned information sharing between recognition models makes these applications tractable: multiple tasks can regularize one another, redundant information can be reused, and novel tasks become faster and easier to learn.
My work focuses on transferring learned information quickly and reliably between visual data sources and across visual tasks, all with limited human supervision. I aim both to formally understand and empirically quantify the degree to which visual models can be adapted, and to provide algorithms that facilitate this information transfer.
Most visual recognition systems learn concepts directly from a large collection of manually annotated images or videos. A model that detects pedestrians requires a human to go through thousands or millions of images and mark every instance of a pedestrian. Such a model is susceptible to biases in the labeled data and often fails to generalize to new scenarios: a detector trained in Palo Alto may show degraded performance in Rome, and a detector trained in sunny weather may fail in the snow. Rather than requiring human supervision for each new task or scenario, my work draws on deep learning, transformation learning, and convex-concave optimization to produce novel optimization frameworks that transfer information from large curated databases to real-world scenarios. The result is strong recognition models for novel tasks, paving the way toward scalable visual understanding.
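The abstract stays at a high level; as one concrete illustration of the general idea (adapting a model trained on a labeled source domain so it also works on a new target domain with little supervision), below is a minimal PyTorch sketch of domain-adversarial feature alignment with a gradient reversal layer. The architecture, layer sizes, loss weighting, and the gradient-reversal technique itself are all illustrative assumptions, not the specific optimization frameworks described in the talk.

```python
# A minimal sketch of adapting a recognition model across domains:
# a shared encoder is trained with a supervised loss on the labeled
# source domain plus a domain-confusion loss that pushes source and
# target features to be indistinguishable. Every module name, layer
# size, and hyperparameter here is an assumption for illustration.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on backward,
    so the encoder is trained to *confuse* the domain classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

features = nn.Sequential(nn.Linear(256, 128), nn.ReLU())  # shared encoder
classifier = nn.Linear(128, 10)                            # label predictor
domain_head = nn.Linear(128, 2)                            # source vs. target

opt = torch.optim.SGD(
    list(features.parameters()) + list(classifier.parameters())
    + list(domain_head.parameters()), lr=1e-2)
ce = nn.CrossEntropyLoss()

def step(src_x, src_y, tgt_x, lam=0.1):
    """One update: supervised loss on labeled source data plus a
    domain-confusion loss computed on both domains."""
    f_src, f_tgt = features(src_x), features(tgt_x)
    cls_loss = ce(classifier(f_src), src_y)
    feats = torch.cat([f_src, f_tgt])
    dom_y = torch.cat([torch.zeros(len(src_x), dtype=torch.long),
                       torch.ones(len(tgt_x), dtype=torch.long)])
    dom_loss = ce(domain_head(GradReverse.apply(feats, lam)), dom_y)
    loss = cls_loss + dom_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with random stand-in data: labeled source batch, unlabeled target batch.
step(torch.randn(8, 256), torch.randint(0, 10, (8,)), torch.randn(8, 256))
```

The key design point is that only the source batch carries labels; the target domain contributes to training solely through the domain-confusion term, which is what lets the adapted model generalize without new annotation.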