Christina Giannoula
University of Toronto
christina.giann@gmail.com
Bio
Christina Giannoula is a Postdoctoral Researcher at the University of Toronto working with Professors Gennady Pekhimenko, Andreas Moshovos, Nandita Vijaykumar and their research groups. She also works with Prof. Onur Mutlu and the SAFARI research group. She received her Ph.D. in October 2022 from the School of Electrical and Computer Engineering at the National Technical University of Athens, advised by Professors Georgios Goumas, Nectarios Koziris and Onur Mutlu. Her research interests lie at the intersection of computer architecture, computer systems and high-performance computing. Specifically, her research focuses on the hardware/software co-design of emerging applications, including graph processing, pointer-chasing data structures, machine learning workloads, and sparse linear algebra, with modern computing paradigms, such as large-scale multicore systems, disaggregated memory systems and near-data processing architectures. She has several publications and awards for her research on these topics. For more information, please see her webpage at https://cgiannoula.github.io/.
Areas of Research
- Computer Architecture
A Handy Runtime Framework for Machine Learning Models on Real Processing-In-Memory Systems
Processing-In-Memory (PIM) computing alleviates data movement bottlenecks between memory and processors by performing computation inside the memory chips. In PIM architectures, processing units are tightly coupled with one or a few memory banks of DRAM devices, enabling low-latency memory access and immense aggregate memory bandwidth. PIM-enabled memory modules are connected to a Host system, i.e., a CPU or GPU, alongside standard memory modules. Recent studies demonstrate that PIM systems can provide significant performance and energy benefits for memory-intensive kernels of Machine Learning (ML) models, such as the aggregation layer of Graph Neural Networks, the pointwise and 1×1 convolutions in Convolutional Neural Networks, and the attention layers of Large Language Models.

However, the primary challenge in fully leveraging these benefits is programmability. Software stacks for PIM systems are still in their early stages, providing limited libraries for specific kernels. Programming PIM systems for memory-intensive ML kernels is therefore difficult, requiring ML programmers to use low-level programming interfaces and/or have deep knowledge of the PIM hardware.

Our research project addresses this challenge by designing Pai, an end-to-end, easy-to-use ML runtime framework for PIM systems. Pai will include a high-level interface designed for programmers familiar with GPU programming, allowing easy deployment of efficient kernels on various PIM hardware, including UPMEM PIM and HBM-PIM systems. Pai will integrate a low-level compilation pass that transforms high-level code into optimized, PIM-hardware-specific code, implementing optimizations such as caching and data parallelism. Additionally, Pai will incorporate a scheduler that optimizes kernel execution between Host and PIM cores, and a memory manager that efficiently handles data allocation, partitioning, and data movement between PIM-enabled and standard memory modules.
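To make the scheduler and memory-manager roles concrete, the sketch below shows one plausible way such decisions could work: routing a kernel to PIM or Host cores based on arithmetic intensity (memory-bound kernels favor PIM's bank-level bandwidth), and partitioning a row-major operand across PIM cores, each of which can only access its local bank(s). All names (`KernelProfile`, `schedule_kernel`, `partition_rows`, the threshold value) are illustrative assumptions for this sketch, not the actual Pai API.

```python
from dataclasses import dataclass

@dataclass
class KernelProfile:
    """Rough cost-model inputs for one ML kernel (hypothetical)."""
    bytes_accessed: int  # total memory traffic of the kernel
    flops: int           # total arithmetic operations

def schedule_kernel(profile: KernelProfile,
                    intensity_threshold: float = 1.0) -> str:
    """Pick an execution target from arithmetic intensity (flops/byte).

    Memory-bound kernels (low intensity) benefit from PIM's aggregate
    bank-level bandwidth; compute-bound kernels stay on the Host.
    The threshold is a placeholder; a real runtime would calibrate it
    per device.
    """
    intensity = profile.flops / max(profile.bytes_accessed, 1)
    return "pim" if intensity < intensity_threshold else "host"

def partition_rows(num_rows: int, num_pim_cores: int) -> list[tuple[int, int]]:
    """Evenly split a row-major operand across PIM cores.

    Each PIM processing unit accesses only its local memory bank(s), so
    operands must be explicitly partitioned before transfer.
    Returns half-open (start, end) row ranges, one per core.
    """
    base, extra = divmod(num_rows, num_pim_cores)
    parts, start = [], 0
    for i in range(num_pim_cores):
        end = start + base + (1 if i < extra else 0)
        parts.append((start, end))
        start = end
    return parts

# Example: GNN aggregation is memory-bound (~0.25 flops/byte here),
# so it is routed to PIM; its operand is split across 4 PIM cores.
agg = KernelProfile(bytes_accessed=8_000_000, flops=2_000_000)
print(schedule_kernel(agg))   # -> "pim"
print(partition_rows(10, 4))  # -> [(0, 3), (3, 6), (6, 8), (8, 10)]
```

In a real runtime the profile would come from compiler analysis or lightweight profiling rather than hand-written constants, and partitioning would also account for inter-core communication, which is costly on current PIM hardware.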
The Pai project aims to bridge the gap between ML engineers and PIM architectures, and will be open-sourced to enable further research on optimizing ML models in memory-centric systems.