Tianqi Tang
UC Santa Barbara
ttq1008@gmail.com
Bio
Tianqi Tang is currently a fifth-year PhD student in the Department of Electrical and Computer Engineering at the University of California, Santa Barbara, advised by Prof. Yuan Xie. She is a member of the Scalable Energy Efficiency Architecture Lab (SEAL). Previously, she received her M.S. and B.S. degrees from the Department of Electronic Engineering at Tsinghua University, Beijing, China. Her research interests lie in the broad fields of computer architecture, non-volatile memory, brain-inspired computing, hardware acceleration, and machine learning. Her PhD thesis focuses on hardware modeling and domain-specific accelerator design.
Hardware Modeling & Efficient Architectural Exploration for Machine Learning Accelerator Design
Innovation in computer architecture (CA) and the development of simulation tools influence each other. The boom in deep learning (DL) and artificial intelligence (AI) drives the need for new tools for new architectures and applications. Meanwhile, DL models keep evolving: the models that are popular when an accelerator ships to market may be quite different from those that were popular when the accelerator was being designed one or two years earlier. This makes it necessary to know, at the early design stage, how well domain-specific accelerators can adapt to a broad spectrum of DL workloads with satisfactory performance and high utilization. A primary goal of my research is to advance DL accelerator design methodology with collaborative contributions on hardware modeling and cost analysis. My research also proposes efficient architectures and numerical optimization methodologies that are applicable to a broad spectrum of emerging DL models. My work develops NeuroMeter, an integrated power, area, and timing modeling framework for ML accelerators. Given runtime activity factors, it enables runtime analysis of system-level performance and efficiency at the pre-RTL design stage. Building on NeuroMeter and an additional package-level resistance-inductance-capacitance path analysis, my work explores, at the inter-chiplet (package) level, how DL workloads perform when the accelerator scales out from a monolithic 2D chip to a multi-chiplet 2.5D System-on-Package (SoP) system.