MICRO 2025: Main Program

Saturday, October 18 / Sunday, October 19: Workshops & Tutorials

Sunday, 5:00 PM KST – 7:00 PM KST: SRC Poster Session

Sunday, 6:00 PM KST – 9:00 PM KST: Welcome Reception

Day 1: Monday, October 20

8:00 AM KST – 8:30 AM KST: Opening Remarks

8:30 AM KST – 9:30 AM KST: Keynote 1 by Luis Ceze (NVIDIA and Univ. of Washington)

AI compilers and inference at scale: efficiency and velocity

Abstract
AI-based applications are being adopted faster than any other new technology in history. This results in extreme demand for compute to support inference at scale. As if that were not challenging enough, AI model architectures evolve at a breakneck pace, requiring systems infrastructure to support them with high velocity. The goal of AI compilers is to make AI workloads run efficiently on the underlying hardware with as little manual work as possible; at their heart is the use of AI techniques themselves. In this talk, I will discuss the origins and history of AI compilers, the current state of the art, and how AI-based codegen is pointing to a likely revolution in AI system software. I will end with thoughts about implications for computer systems architecture and HW-SW co-design.

Bio
Luis Ceze is Lazowska Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, where he has been on the faculty since 2007. He is also VP of AI Systems Software at NVIDIA, following the acquisition of OctoAI, where he was co-founder and CEO. His current research focuses on scalable and efficient AI systems, as well as the intersection of computing and biology for IT applications. He is part of the UW SyFI (Systems for Future Intelligence) lab and the UW MISL (Molecular Information Systems Lab). He is a Sloan Research Fellow and ACM Fellow.

9:30 AM KST – 9:50 AM KST: Coffee Break

9:50 AM KST – 10:50 AM KST

Session 1A: Systems for AI (LLMs) - 1

Session Chair: Ramyad Hadidi (d-Matrix)

Stratum: System-Hardware Co-design with Tiered Monolithic 3D-DRAM for Efficient MoE Serving

Yue Pan, Zihan Xia (Univ. of California, San Diego); Po-Kai Hsu (Georgia Inst. of Technology); Lanxiang Hu (Univ. of California, San Diego); Hyungyo Kim (Univ. of Illinois Urbana-Champaign); Janak Sharda (Georgia Inst. of Technology); Minxuan Zhou (Illinois Inst. of Technology); Nam Sung Kim (Univ. of Illinois Urbana-Champaign); Shimeng Yu (Georgia Inst. of Technology); Tajana Rosing, Mingu Kang (Univ. of California, San Diego)

Kelle: Co-design KV Caching and eDRAM for Efficient LLM Serving in Edge Computing

Tianhua Xia, Sai Qian Zhang (New York Univ.)

LongSight: Compute-Enabled Memory to Accelerate Large-Context LLMs via Sparse Attention

Derrick Quinn, E. Ezgi Yücel, Jinkwon Kim, José F. Martínez, Mohammad Alian (Cornell Univ.)

Session 1B: Processing-In-Memory - 1

Session Chair: Daichi Fujiki (Inst. of Science Tokyo)

ComPASS: A Compatible PIM Protocol Architecture and Scheduling Solution for Processor-PIM Collaboration

Seunghyuk Yu, Hyeonu Kim, Kyoungho Jeun, Sunyoung Hwang, Seongmin Cho, Eojin Lee (Inha Univ.)

PIM-CCA: An Efficient PIM Architecture with Optimized Integration of Configurable Functional Units

Jeehyun Kim, Donghyeon Kim, Seokwon Kang (Yonsei Univ.); Bongjoon Hyun (KAIST); Inho Lee (Hanyang Univ.); Yongjun Park (Yonsei Univ.)

3D-PATH: A Hierarchy LUT Processing-in-memory Accelerator with Thermal-aware Hybrid Bonding Integration

Zhiheng Yue, Yang Wang (Tsinghua Univ.); Chao Li (Shanghai Jiao Tong Univ.); Shaojun Wei, Yang Hu, Shouyi Yin (Tsinghua Univ.)

Session 1C: Security and Privacy - Side Channels

Session Chair: Shuwen Deng (Tsinghua Univ.)

One Flew over the Stack Engine’s Nest: Practical Microarchitectural Attacks on the Stack Engine

Silvan Niederer, Sandro Rüegge, Ali Hajiabadi, Kaveh Razavi (ETH Zürich)

DExiM: Exposing Impedance-Based Data Leakage in Emerging Memories

Md Sadik Awal, Md Tauhidur Rahman (Florida International Univ.)

Sonar: A Fuzzing Framework to Uncover Contention Side Channels in Processors

Kanqi Zhang, Peinan Li, Miao Li, Xin Tian, Zelong Du, Quachen Liu (Inst. of Information Engineering, Chinese Academy of Sciences); Yongqiang Lyu (Tsinghua Univ., Beijing, China); Yu Jiang (Tsinghua Univ.); Dan Meng, Rui Hou (Inst. of Information Engineering, CAS)

Session 1D: Microarchitecture - Prefetching 1

Session Chair: Heiner Litz (Univ. of California, Santa Cruz)

Symbiotic Task Scheduling and Data Prefetching

Gilead Posluns, Mark Jeffrey (Univ. of Toronto)

Software Prefetch Multicast: Sharer-Exposed Prefetching for Bandwidth Efficiency in Manycore Processors

Yanhua Chen, Jiong Feng (The Hong Kong Univ. of Science and Technology (Guangzhou)); Zhe Wang (Intel Labs); Christopher J. Hughes (Intel); Jiayi Huang (HKUST(GZ))

RICH Prefetcher: Storing Rich Information in Memory to Trade Capacity and Bandwidth for Latency Hiding

Ningzhi Ai (Huawei Technologies Co., Ltd, Tsinghua Univ.); Wenjian He (Huawei Technologies Co., Ltd); Hu He (Tsinghua Univ.); Jing Xia, Heng Liao, Guowei Zhang (Huawei Technologies Co., Ltd)

10:50 AM KST – 11:10 AM KST: Coffee Break

11:10 AM KST – 12:30 PM KST

Session 2A: Systems for AI (LLMs) - 2

Session Chair: Yunho Oh (Korea Univ.)

DECA: A Near-Core LLM Decompression Accelerator Grounded on a 3D Roofline Model

Gerasimos Gerogiannis (Intel, Univ. of Illinois at Urbana-Champaign); Stijn Eyerman (Intel); Evangelos Georganas (Intel Labs); Wim Heirman (Intel); Josep Torrellas (Univ. of Illinois Urbana-Champaign)

StreamTensor: Make Tensors Stream in Dataflow Accelerators for LLMs

Hanchen Ye, Deming Chen (Inspirit IoT, Inc. and Univ. of Illinois Urbana-Champaign)

Chameleon: Adaptive Caching and Scheduling for Many-Adapter LLM Inference Environments

Nikoleta Iliakopoulou, Jovan Stojkovic, Chloe Alverti, Tianyin Xu (Univ. of Illinois Urbana-Champaign); Hubertus Franke (IBM Research); Josep Torrellas (Univ. of Illinois Urbana-Champaign)

Coruscant: Co-Designing GPU Kernel and Sparse Tensor Core to Advocate Unstructured Sparsity in Efficient LLM Inference

Donghyeon Joo, Helya Hosseini (Univ. of Maryland, College Park); Ramyad Hadidi (d-Matrix); Bahar Asgari (Univ. of Maryland, College Park)

Session 2B: Processing-In-Memory - 2

Session Chair: Mohammad Alian (Cornell Univ.)

Accelerating Retrieval Augmented Language Model via PIM and PNM Integration

Je-Woo Jang, Junyong Oh, Youngbae Kong, Jae-Youn Hong, Sung-Hyuk Cho, Jeongyeol Lee (Yonsei Univ.); Hoeseok Yang (Santa Clara Univ.); Joon-Sung Yang (Yonsei Univ.)

HEAT: NPU-NDP HEterogeneous Architecture for Transformer-Empowered Graph Neural Networks

Ruiyang Chen, Zhuoran Song, Yicheng Zheng (Shanghai Jiao Tong Univ.); Zeyu Zhu, Gang Li (Inst. of Automation, Chinese Academy of Sciences); Naifeng Jing, Xiaoyao Liang, Haibing Guan (Shanghai Jiao Tong Univ.)

RayN: Ray Tracing Acceleration with Near-memory Computing

Mohammadreza Saed, Prashant J. Nair, Tor M. Aamodt (Univ. of British Columbia)

Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving

Wonung Kim, Yubin Lee, Yoonsung Kim, Jinwoo Hwang, Seongryong Oh, Jiyong Jung, Aziz Huseynov, Woong Gyu Park (KAIST); Chang Hyun Park (Uppsala Univ.); Divya Mahajan (Georgia Inst. of Technology); Jongse Park (KAIST)

Session 2C: Security and Privacy - Machine Learning

Session Chair: Meng Li (Peking Univ.)

GateBleed: A Timing-Only Membership Inference Attack, MoE-Routing Inference, and a Stealthy, Generic Magnifier Via Hardware Power Gating in AI Accelerators

Joshua Kalyanapu, Farshad Dizani, Darsh Asher, Azam Ghanbari (North Carolina State Univ.); Rosario Cammarota (Intel); Aydin Aysu, Samira Mirbagher (North Carolina State Univ.)

Athena: Accelerating Quantized Convolutional Neural Networks under Fully Homomorphic Encryption

Yinghao Yang, Xicheng Xu (Inst. of Computing Technology, Chinese Academy of Sciences); Liang Chang (Univ. of Electronic Science and Technology of China); Hang Lu, Xiaowei Li (Inst. of Computing Technology, Chinese Academy of Sciences)

ccAI: A Compatible and Confidential System for AI Computing

Chenxu Wang (Southern Univ. of Science and Technology (SUSTech) and The Hong Kong Polytechnic Univ.); Danqing Tang, Changxu Ci (Ant Group); Junjie Huang, Yankai Xu, Fengwei Zhang (Southern Univ. of Science and Technology (SUSTech)); Jiannong Cao (The Hong Kong Polytechnic Univ.); Jie Song, Shoumeng Yan, Tao Wei, Zhengyu He (Ant Group)

Ironman: Accelerating Oblivious Transfer Extension for Privacy-Preserving AI with Near-Memory Processing

Chenqi Lin (School of Software & Microelectronics, Peking Univ.); Kang Yang (State Key Laboratory of Cryptology); Tianshi Xu (Peking Univ.); Ling Liang (DAMO Academy Alibaba Group, Peking Univ.); Yufei Wang (Alibaba Group); Zhaohui Chen (Computing Technology Lab, Alibaba Group); Runsheng Wang (Peking Univ.); Mingyu Gao (Tsinghua Univ.); Meng Li (Peking Univ.)

Session 2D: GPU - 1

Session Chair: Jiayi Huang (HKUST(GZ))

Dissecting and Modeling the Architecture of Modern GPU Cores

Rodrigo Huerta, Mojtaba Abaie Shoushtary, José-Lorenzo Cruz, Antonio Gonzalez (Universitat Politecnica de Catalunya)

Interleaved Bitstream Execution for Multi-Pattern Regex Matching on GPUs

Tianao Ge (Hong Kong Univ. of Science and Technology (Guangzhou)); Xiaowen Chu (Data Science and Analytics Thrust, HKUST(GZ)); Hongyuan Liu (Stevens Inst. of Technology)

SoftWalker: Supporting Software Page Table Walk for Irregular GPU Applications

Sungbin Jang, Junhyeok Park, Yongho Lee, Osang Kwon, Donghyun Kim, Juyoung Seok, Seokin Hong (Sungkyunkwan Univ.)

LATPC: Accelerating GPU Address Translation Using Locality-Aware TLB Prefetching and MSHR Compression

Yeonan Ha, Jiho Park, Hanna Cha (Yonsei Univ.); Jiwon Lee (Yonsei Univ. / Samsung Electronics); Joonsung Kim (Sungkyunkwan Univ.); Won Woo Ro, Youngsok Kim (Yonsei Univ.)

12:30 PM KST – 2:00 PM KST: Lunch

SRC Forum runs from 1:20 PM to 3:20 PM.

2:00 PM KST – 3:20 PM KST

Session 3A: Systems for AI (Emerging Applications)

Session Chair: Mingu Kang (Univ. of California, San Diego)

S-DMA: Sparse Diffusion Models Acceleration via Spatiality-Aware Prediction and Dimension-Adaptive Dataflow

Zihan Zou, Xinming Yan, Shun Zhang, Peng Zheng, Guang Yang, Hao Cai, Bo Liu (Southeast Univ.)

LLM.265: Video Codecs are Secretly Tensor Codecs

Best Paper Award

Ceyu Xu, Yongji Wu (Duke Univ.); Xinyu Yang, Beidi Chen (Carnegie Mellon Univ.); Matthew Lentz, Danyang Zhuo, Lisa Wu Wills (Duke Univ.)

HLX: A Unified Pipelined Architecture for Optimized Performance of Hybrid Transformer-Mamba Language Models

In-Jun Jung, Gyeongrok Yang, Jaeha Min, Joo-Young Kim (KAIST)

ORCHES: Orchestrated Test-Time-Computation-based LLM Reasoning on Collaborative GPU-PIM HEterogeneous System

Sixu Li, Yuzhou Chen, Chaojian Li, Yonggan Fu, Zheng Wang, Zhongzhi Yu, Haoran You, Zhifan Ye, Wei Zhou, Yongan Zhang, Yingyan (Celine) Lin (Georgia Inst. of Technology)

Session 3B: Microarchitecture I

Session Chair: Leeor Peled (Huawei)

LoopFrog: In-Core Hint-Based Loop Parallelization

Marton Erdos, Utpal Bora, Akshay Bhosale (Univ. of Cambridge); Bob Lytton, Ali Zaidi (Arm); Alexandra W Chadwick, Yuxin Guo (Univ. of Cambridge); Giacomo Gabrielli (Arm); Timothy M Jones (Univ. of Cambridge)

Multi-Stream Squash Reuse for Control-Independent Processors

Qingxuan Kang, Trevor E. Carlson (National Univ. of Singapore)

Drishti: Do Not Forget Slicing While Designing Last-Level Cache Replacement Policies for Many-Core Systems

Best Paper Candidate

Sweta, Prerna Priyadarshini, Biswabandan Panda (Indian Inst. of Technology Bombay)

A TRRIP Down Memory Lane: Temperature-Based Re-Reference Interval Prediction For Instruction Caching

Henry Kao, Nikhil Sreekumar, Prabhdeep Singh Soni, Ali Sedaghati (Huawei Technologies Canada); Fang Su (Huawei); Bryan Chan (Huawei Technologies Canada); Maziar Goudarzi (Huawei Technologies 2012 Labs, Canada Research Center); Reza Azimi (Huawei)

Session 3C: Quantum - 1

Session Chair: Dongmoon Min (Sungkyunkwan Univ.)

LANCER: Low-Overhead, Accurate, and Non-Destructive Calibration for Real-World Fault-Tolerant Quantum Applications

Junpyo Kim, Jungmin Cho, Hyeonseong Jeong (Seoul National Univ.); Dongmoon Min (Sungkyunkwan Univ.); Junhyuk Choi, Juwon Hong, Jangwoo Kim (Seoul National Univ.)

Distributed-HISQ: A Distributed Quantum Control Architecture

Yilun Zhao (Inst. of Computing Technology, Chinese Academy of Sciences); Kangding Zhao (College of Computer Science and Technology, National Univ. of Defense Technology); Peng Zhou (China Greatwall Quantum Laboratory); Dingdong Liu (College of Computer Science and Technology, National Univ. of Defense Technology); Tingyu Luo (East China Normal Univ.); Yuzhen Zheng, Peng Luo, Shun Hu (College of Computer Science and Technology, National Univ. of Defense Technology); Jin Lin (Hefei National Laboratory, Univ. of Science and Technology of China); Cheng Guo (Hefei National Laboratory); Yinhe Han (ICT, Chinese Academy of Sciences); Ying Wang (Inst. of Computing Technology, Chinese Academy of Sciences); Mingtang Deng, Junjie Wu, Xiang Fu (College of Computer Science and Technology, National Univ. of Defense Technology)

Accurate Leakage Speculation for Quantum Error Correction

Chaithanya Naik Mude, Swamit Tannu (Univ. of Wisconsin-Madison)

YOUTIAO: Hybrid Multiplexing with Dynamic Qubit Grouping for Low-cost and Scalable Quantum Wiring

Wuwei Tian, Liqiang Lu, Siwei Tan, Shiyu Li, Hengyi Li, Tianyao Chu, Xuhong Zhang, Mingshuai Chen, Jianwei Yin (Zhejiang Univ.)

Session 3D: SRC Forum

Session Chair: SRC Org.

3:20 PM KST – 3:40 PM KST: Coffee Break

3:40 PM KST – 5:00 PM KST

Session 4A: Systems for AI (Training)

Session Chair: Jinho Lee (Seoul National Univ.)

NetZIP: Algorithm/Hardware Co-design of In-network Lossless Compression for Distributed Large Model Training

Jinghan Huang, Hyungyo Kim, Nachuan Wang, Jaeyoung Kang, Hrishi Shah (Univ. of Illinois Urbana-Champaign); Eun Kyung Lee (IBM Research); Minjia Zhang, Fan Lai, Nam Sung Kim (Univ. of Illinois Urbana-Champaign)

Characterizing the Efficiency of Distributed Training: A Power, Performance, and Thermal Perspective

Seokjin Go, Joongun Park, Spandan More, Hanjiang Wu, Irene Wang, Aaron Jezghani, Tushar Krishna, Divya Mahajan (Georgia Inst. of Technology)

SkipReduce: (Interconnection) Network Sparsity to Accelerate Distributed Machine Learning

Hans Kasan (KAIST); Dennis Abts (NVIDIA); Jungwook Choi (Hanyang Univ.); John Kim (KAIST)

Optimizing All-to-All Collective Communication with Fault Tolerance on Torus Networks

Le Qin, Junwei Cui, Weilin Cai (The Hong Kong Univ. of Science and Technology (Guangzhou)); Meng Niu, Yan Yang (Huawei); Jiayi Huang (HKUST(GZ))

Session 4B: Microarchitecture II

Session Chair: Lisa Wu Wills (Duke Univ.)

Titan-I: An Open-Source, High Performance RISC-V Vector Core

Jiuyang Liu (Huazhong Univ. of Science and Technology); Qinjun Li (Inst. of Software, Chinese Academy of Sciences); Yunqian Luo (Tsinghua Univ.); Hongbin Zhang (Inst. of Software, Chinese Academy of Sciences; Univ. of Chinese Academy of Sciences); Jiongjia Lu (Henan Academy of Sciences); Shupei Fan (Tsinghua Univ.); Jianhao Ye (Univ. of Chinese Academy of Sciences); Xiaoyi Liu, Ao Shen (Tsinghua Univ.); Yang Liu (Inst. of Software, Chinese Academy of Sciences); Yanqi Yang (Huazhong Univ. of Science and Technology); Zewen Ye (Zhejiang Univ.); Yuhang Zeng, Rui Huang (Wuhan Xinpian Technology Co., Ltd.); Mingyu Gao (Tsinghua Univ.); Xuecheng Zou (Henan Academy of Sciences); Wei Cong (Nanjing UCUN Technology Inc)

SHADOW: Simultaneous Multi-Threading Architecture with Asymmetric Threads

Ishita Chaturvedi (Princeton Univ.); Bhargav Reddy Godala (AheadComputing); Abiram Gangavaram, Daniel Flyer (Princeton Univ.); Tyler Sorensen (UC Santa Cruz and Microsoft); Tor M. Aamodt (Univ. of British Columbia); David I. August (Princeton Univ.)

ATR: Out-of-Order Register Release Exploiting Atomic Regions

Yinyuan Zhao, Surim Oh, Mingsheng Xu, Heiner Litz (Univ. of California, Santa Cruz)

Session 4C: Quantum - 2

Session Chair: Teruo Tanimoto (Kyushu Univ.)

Vegapunk: Accurate and Fast Decoding for Quantum LDPC Codes with Online Hierarchical Algorithm and Sparse Accelerator

Kaiwen Zhou, Liqiang Lu, Debin Xiang, Chenning Tao (Zhejiang Univ.); Anbang Wu, Jingwen Leng, Fangxin Liu (Shanghai Jiao Tong Univ.); Mingshuai Chen, Jianwei Yin (Zhejiang Univ.)

Resource-adaptive Compilation of Photonic One-way Quantum Computing

Hezi Zhang, Jixuan Ruan, Dean Tullsen, Yufei Ding (Univ. of California San Diego); Ang Li (PNNL and UW); Travis Humble (Quantum Science Center, Oak Ridge National Laboratory)

MUSS-TI: Multi-level Shuttle Scheduling for Large-Scale Entanglement Module Linked Trapped-Ion

Xian Wu, Chenghong Zhu (The Hong Kong Univ. of Science and Technology (Guangzhou)); Jingbo Wang (Beijing Academy of Quantum Information Sciences); Xin Wang (The Hong Kong Univ. of Science and Technology (Guangzhou))

Rasengan: A Transition Hamiltonian-based Approximation Algorithm for Solving Constrained Binary Optimization Problems

Qifan Jiang, Liqiang Lu, Debin Xiang, Tianyao Chu, Tianze Zhu (Zhejiang Univ.); Jingwen Leng (Shanghai Jiao Tong Univ.); Yun (Eric) Liang (Peking Univ.); Xiaoming Sun (Inst. of Computing Technology, Chinese Academy of Sciences); Jianwei Yin (Zhejiang Univ.)

Session 4D: Sparsity - 1

Session Chair: Ranggi Hwang (UNIST)

Chasoň: Supporting Cross HBM Channel Data Migration to Enable Efficient Sparse Algebraic Acceleration

Ubaid Bakhtiar, Amirmahdi Namjoo (Univ. of Maryland-College Park); Bahar Asgari (Univ. of Maryland, College Park)

A Probabilistic Perspective on Tiling Sparse Tensor Algebra

Ritvik Sharma (Stanford Univ.); Fisher Xue (Massachusetts Inst. of Technology); Nathan Zhang, Rubens Lacouture, Fredrik Kjolstad, Sara Achour, Mark Horowitz (Stanford Univ.)

Boötes: Boosting the Efficiency of Sparse Accelerators Using Spectral Clustering

Sanjali Yadav (Univ. of Maryland); Bahar Asgari (Univ. of Maryland, College Park)

Misam: Machine Learning Assisted Dataflow Selection in Accelerators for Sparse Matrix Multiplication

Sanjali Yadav (Univ. of Maryland); Amirmehdi Namjoo (Univ. of Maryland-College Park); Bahar Asgari (Univ. of Maryland, College Park)

5:00 PM KST – 6:00 PM KST: Poster Session & Job Candidate Showcase session

Job Candidate Showcase session (PhD Forum)

6:00 PM KST – 7:30 PM KST: Business Meeting

Day 2: Tuesday, October 21

8:00 AM KST – 9:00 AM KST: Keynote 2 by Onur Mutlu (ETH Zürich)

Can We Do Better?

Abstract
This talk will critically examine various aspects of computing system design and the research & development process that goes into it, from the speaker’s perspective. We aim to deconstruct some assumptions that go into various design, research, and development processes, with the goal of enabling and hopefully inspiring better and fundamentally more efficient ways of doing things. A key assumption we will deconstruct is the processor-centric design mindset and paradigm of thinking about and designing computing systems, which is hitting many scaling limits. We aim to also examine other issues, including how we analyze and vet scientific research and its broader implications.

Bio
Onur Mutlu is a Professor of Computer Science at ETH Zurich. He previously held the William D. and Nancy W. Strecker Early Career Professorship at Carnegie Mellon University. His research interests are in computer architecture, computing systems, hardware security, memory & storage systems, and bioinformatics, with a major focus on designing fundamentally energy-efficient, high-performance, and robust computing systems. Many techniques he, with his group and collaborators, has invented over the years have largely influenced industry and have been widely employed in commercial microprocessors and memory & storage systems used daily by billions of people. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held product, research and visiting positions at Intel Corporation, Advanced Micro Devices, VMware, Google, and Stanford University. He received various honors for his impactful research, including the 2025 IEEE Computer Society Harry H. Goode Memorial Award “for seminal contributions to computer architecture research and practice, especially in memory systems,” 2024 IFIP Jean-Claude Laprie Award in Dependable Computing (for the original RowHammer work), 2021 IEEE High Performance Computer Architecture Conference Test of Time Award (for the Runahead Execution work), 2022 Persistent Impact Prize of the Non-Volatile Memory Systems Workshop (for the leading architectural work on Phase Change Memory), 2025 Dependable Systems and Networks Conference Test of Time Award, 2023 Huawei OlympusMons Award in Storage Systems, 2019 ACM SIGARCH Maurice Wilkes Award, and dozens of best paper or “Top Pick” paper recognitions at various leading computer systems, architecture, and security venues. He is an ACM Fellow, IEEE Fellow, and an elected member of the Academy of Europe. 
He enjoys teaching, mentoring, and enabling broad global access to high-quality research and education. He has supervised 24 PhD graduates, many of whom received major dissertation & other awards, 15 postdoctoral trainees, and more than 60 Master’s and Bachelor’s students. His computer architecture and digital logic design course lectures and materials are freely available on YouTube, and his research group makes a wide variety of open-source artifacts freely available online. For more information, please see his webpage at https://people.inf.ethz.ch/omutlu/.

9:00 AM KST – 9:20 AM KST: Coffee Break

9:20 AM KST – 10:20 AM KST

Session 5A: Systems for AI (Quantization)

Session Chair: Baris Kasikci (Univ. of Washington)

AxCore: A Quantization-Aware Approximate GEMM Unit for LLM Inference

Jiaxiang Zou, Yonghao Chen, Xingyu Chen, Chenxi Xu, Xinyu Chen (The Hong Kong Univ. of Science and Technology (Guangzhou))

Amove: Accelerating LLMs through Mitigating Outliers and Salient Points via Fine-Grained Grouped Vectorized Data Type

Xilong Xie, Liang Wang, Limin Xiao (Beihang Univ.); Meng Han (Tsinghua Univ.); Lei Liu, Xiangrong Xu, Jinquan Wang, Zhen Song, Xiaojian Liao (Beihang Univ.)

MX+: Pushing the Limits of Microscaling Formats for Efficient Large Language Model Serving

Jungi Lee, Junyong Park, Soohyun Cha, Jaehoon Cho, Jaewoong Sim (Seoul National Univ.)

Session 5B: Microarchitecture - Prefetching 2

Session Chair: Biswabandan Panda (Indian Inst. of Technology Bombay)

Micro-MAMA: Multi-Agent Reinforcement Learning for Multicore Prefetching

Charles Block, Gerasimos Gerogiannis, Josep Torrellas (Univ. of Illinois Urbana-Champaign)

Ghost Threading: Helper-Thread Prefetching for Real Systems

Yuxin Guo, Alexandra W. Chadwick, Márton Erdös, Utpal Bora, Akshay Bhosale (Univ. of Cambridge); Giacomo Gabrielli (Arm); Timothy M. Jones (Univ. of Cambridge)

Elevating Temporal Prefetching Through Instruction Correlation

Shuiyi He, Zicong Wang, Xuan Tang, Hao Tang (National Univ. of Defense Technology); Dezun Dong (NUDT); Liquan Xiao (National Univ. of Defense Technology)

Session 5C: Sparsity - 2

Session Chair: Bahar Asgari (Univ. of Maryland, College Park)

Quartz: A Reconfigurable, Distributed Memory Accelerator for Sparse Applications

Courtney Golden, Axel Feldmann (Massachusetts Inst. of Technology); Joel Emer (MIT/NVIDIA); Daniel Sanchez (Massachusetts Inst. of Technology)

SeaCache: Efficient and Adaptive Caching for Sparse Accelerators

Xintong Li, Jinchen Jiang, Mingyu Gao (Tsinghua Univ.)

NetSparse: In-Network Acceleration of Distributed Sparse Kernels

Gerasimos Gerogiannis, Dimitrios Merkouriadis, Charles Block (Univ. of Illinois Urbana-Champaign); Annus Zulfiqar (Univ. of Michigan); Filippos Tofalos (Univ. of Illinois Urbana-Champaign); Muhammad Shahbaz (Univ. of Michigan); Josep Torrellas (Univ. of Illinois Urbana-Champaign)

Session 5D: Superconducting Systems

Session Chair: Hunjun Lee (Hanyang Univ.)

ColumnDisturb: Understanding Column-based Read Disturbance in Real DRAM Chips and Implications for Future Systems

Ismail Emir Yuksel, Ataberk Olgun, Nisa Bostanci, Haocong Luo, Abdullah Giray Yaglikci, Onur Mutlu (ETH Zürich)

SuperSFQ: A Hardware Design to Realize High-Frequency Superconducting Processors

Junhyuk Choi, Juwon Hong, Junpyo Kim, Jungmin Cho, Hyeonseong Jeong (Seoul National Univ.); Dongmoon Min (Sungkyunkwan Univ.); Masamitsu Tanaka (Nagoya Univ.); Koji Inoue (Kyushu Univ.); Jangwoo Kim (Seoul National Univ.)

Characterizing and Optimizing Realistic Workloads on a Commercial Compute-in-SRAM Device

Niansong Zhang (Cornell Univ.); Wenbo Zhu (Univ. of Southern California); Courtney Golden (Massachusetts Inst. of Technology); Dan Ilan (GSI Inc.); Hongzheng Chen, Christopher Batten, Zhiru Zhang (Cornell Univ.)

10:20 AM KST – 10:40 AM KST: Coffee Break

10:40 AM KST – 12:00 PM KST

Session 6A: GPU - 2

Session Chair: Gunjae Koo (Korea Univ.)

C3ache: Towards Hierarchical Cache-Centric Computing for Sparse Matrix Multiplication on GPGPUs

Xiaojie Li, Mingyu Wang, Baiqing Zhong, Haiqiu Huang, Guangjie Cao, Zhiyi Yu (Sun Yat-sen Univ.)

Leveraging Chiplet-Locality for Efficient Memory Mapping in Multi-Chip Module GPUs

Junhyeok Park, Sungbin Jang, Osang Kwon, Yongho Lee, Seokin Hong (Sungkyunkwan Univ.)

Security and Performance Implications of GPU Cache Eviction Priority Hints

Qizhong Wang, Xiangyue Huang (Univ. of California, Santa Cruz); Yanan Guo (Univ. of Rochester); Yuanchao Xu (Univ. of California, Santa Cruz)

Session 6B: Security and Privacy - Memory

Session Chair: Ali Hajiabadi (ETH Zürich)

COSMOS: RL-Enhanced Locality-Aware Counter Cache Optimization for Secure Memory

Haoran Geng (Univ. of Notre Dame); Xiaoyang Lu (Illinois Inst. of Technology); Yuezhi Che, Ziang Tian, Dazhao Cheng (Wuhan Univ.); Xian-He Sun (Illinois Inst. of Technology); Michael Niemier, X. Sharon Hu (Univ. of Notre Dame)

CryptoBTB: A Secure Hierarchical BTB for Diverse Instruction Footprint Workloads

Debpratim Adak, Eric Rotenberg (North Carolina State Univ.); Amro Awad (Univ. of Oxford); Huiyang Zhou (North Carolina State Univ.)

Efficient Security Support for CXL Memory through Adaptive Incremental Offloaded (Re-)Encryption

Chuanhan Li (UC-Santa Cruz); Jishen Zhao (Univ. of California, San Diego); Yuanchao Xu (Univ. of California, Santa Cruz)

Citadel: Rethinking Memory Allocation to Safeguard Against Inter-Domain Rowhammer Exploits

Anish Saxena, Walter Wang, Alexandros Daglis (Georgia Inst. of Technology)

Session 6C: Energy and Power

Session Chair: Esha Choukse (Microsoft)

EcoCore: Dynamic Core Management for Improving Energy Efficiency in Latency-Critical Applications

Gyeongseo Park (ETRI); Minho Kim (DGIST); Ki-Dong Kang (ETRI); Yunhyeong Jeon, Seulki Kim (DGIST); Daehoon Kim (Yonsei Univ.)

Flexing RISC-V Instruction Subset Processors to Extreme Edge

Best Paper Award

Alireza Raisiardali (Pragmatic Semiconductor / KU Leuven); Konstantinos Iordanou, Jedrzej Kufel, Kowshik Gudimetla (Pragmatic Semiconductor); Kris Myny (KU Leuven); Emre Ozer (Pragmatic Semiconductor)

ReGate: Enabling Power Gating in Neural Processing Units

Yuqi Xue, Jian Huang (Univ. of Illinois Urbana-Champaign)

Multi-Dimensional ML-Pipeline Optimization in Cost-Effective Disaggregated Datacenter

Pingyi Huo, Anusha Devulapally (The Pennsylvania State Univ.); Hasan Al Maruf (META, Inc); Nandhini Chandramoorthy (IBM, Inc); Meena Arunachalam (AMD, Inc); Gulsum Gudukbay Akbulut, Mahmut T Kandemir, Vijaykrishnan Narayanan (The Pennsylvania State Univ.)

Session 6D: Reconfigurable Computing and Storage

Session Chair: Jisung Park (POSTECH)

CrossBit: Bitwise Computing in NAND Flash Memory with Inter-Bitline Data Communication

Hyunjin Kim (Seoul National Univ.); Seunghwan Song (Samsung Electronics); Sukhyun Choi (Seoul National Univ.); Jeongin Choe (Samsung Electronics); Sanghyeok Han (Seoul National Univ.); Jisung Park (POSTECH (Pohang Univ. of Science and Technology)); Jinho Lee, Jae-Joon Kim (Seoul National Univ.)

DEAR: Improving Performance and Lifetime of SSDs Using Dynamic Error-Aware Refresh

Jaeyong Lee (Seoul National Univ.); Beomjun Kim (Kyungpook National Univ.); Myoungjun Chun (Seoul National Univ.); Myungsuk Kim (Kyungpook National Univ.); Jihong Kim (Seoul National Univ.)

Nexus Machine: An Energy-Efficient Active Message Inspired Reconfigurable Architecture

Rohan Juneja, Pranav Dangi, Thilini Kaushalya Bandara, Tulika Mitra, Li-shiuan Peh (National Univ. of Singapore)

FexMo: Enabling Fuse Execution Mode for Multi-task CGRAs

Yufei Yang (Harbin Inst. of Technology); Chenhao Xie (Beihang Univ.); Chuliang Guo, Liansheng Liu, Yu Peng, Xiyuan Peng, Datong Liu (Harbin Inst. of Technology)

12:00 PM KST – 1:30 PM KST: Awards Lunch

1:30 PM KST – 3:10 PM KST

Session 7A: Systems for AI (HW/SW Support)

Session Chair: Jaewoong Sim (Seoul National Univ.)

Crane: Inter-Layer Scheduling Framework for DNN Inference and Training Co-Support on Tiled Architecture

Yu Gong (Rutgers Univ.); Haodong Chang (Texas A&M Univ.); Lingyi Huang (Rutgers Univ.); Rongjian Liang (NVIDIA); Cheng Yang, Zhexiang Tang (Rutgers Univ.); Jiang Hu (Texas A&M Univ.); Bo Yuan (Rutgers Univ.)

OASIS: A Commercial High Performance Terminal AI Processor Supporting RISC-V Tensor Extension Instructions

Peng Gao (Beijing Univ. of Posts and Telecommunications, SOPHGO TECHNOLOGIES LTD.); Yang Liu, Haonan Sun (Beijing Univ. of Posts and Telecommunications); Jiang Jiang, Jun Wang, Zonghui Hong, Jiali Qu (SOPHGO TECHNOLOGIES LTD.)

ELK: Exploring the Efficiency of Inter-core Connected AI Chips with Deep Learning Compiler Techniques

Yiqi Liu, Yuqi Xue, Noelle Crawford (Univ. of Illinois Urbana-Champaign); Jilong Xue (Microsoft Research); Jian Huang (Univ. of Illinois Urbana-Champaign)

Empowering Vector Architectures for ML: The CAMP Architecture for Matrix Multiplication

Mohammadreza Esmali Nojehdeh, Hossein Mokhtarnia, Julian Pavon Rivera, Narcis Rodas Quiroga, Roger Figueras Bagué, Enrico Reggiani (Barcelona Supercomputing Center); Miquel Moreto (UPC/BSC); Osman Unsal, Adrian Cristal (BSC); Eduard Ayguade (Universitat Politecnica de Catalunya & Barcelona Supercomputing Center, Barcelona, Spain)

TAIDL: Tensor Accelerator ISA Definition Language with Auto-generation of Scalable Test Oracles

Devansh Jain, Marco Frigo, Jai Arora, Akash Pardeshi, Zhihao Wang, Krut Patel, Charith Mendis (Univ. of Illinois Urbana-Champaign)

Session 7B: Tools and Simulators

Session Chair: Jose Joao (Arm)

Simulating Hardware with C Speed and RTL Accuracy for High-Level Synthesis Designs

Rishov Sarkar, Cong (Callie) Hao (Georgia Inst. of Technology)

LEGOSim: A Unified Parallel Simulation Framework for Multi-chiplet Heterogeneous Integration

Tiantian Lin (Zhejiang Univ.); Cheng Qiu (South China Univ. of Technology); Xiaohang Wang (Zhejiang Univ.); Ling Wang (The Univ. of Western Australia); Zhulin Zheng (Zhejiang Univ.); Yingtao Jiang (Univ. of Nevada, Las Vegas); Amit Kumar Singh (Univ. of Essex); Jieming Yin (Nanjing Univ. of Posts and Telecommunications); Sihai Qiu (Beijing Smart-chip Microelectronics Technology Co., Ltd.); Xiaodong Li, Xin Tang, Jie Song, Mingzhe Zhang (Ant Group); Kui Ren (Zhejiang Univ.)

PyTorchSim: A Comprehensive, Fast, and Accurate NPU Simulation Framework

Wonhyuk Yang, Yunseon Shin, Okkyun Woo, Geonwoo Park, Hyungkyu Ham (POSTECH); Jeehoon Kang (KAIST / FuriosaAI); Jongse Park (KAIST); Gwangsun Kim (POSTECH / Arm)

LLMulator: Generalizable Cost Modeling for Dataflow Accelerators with Input-Adaptive Control Flow

Kaiyan Chang, Wenlong Zhu, Shengwen Liang, Huawei Li, Ying Wang (SKLP, Inst. of Computing Technology, Chinese Academy of Sciences)

Swift and Trustworthy Large-Scale GPU Simulation with Fine-Grained Error Modeling and Hierarchical Clustering

Euijun Chung, Seonjin Na, Sung Ha Kang, Hyesoon Kim (Georgia Inst. of Technology)

Session 7C: Reliability, Fault-tolerance

Session Chair: Jungrae Kim (Sungkyunkwan Univ.)

Understanding and Mitigating Covert and Side Channel Vulnerabilities Introduced by RowHammer Defenses

Nisa Bostanci (ETH Zürich); Oguzhan Canpolat (TOBB ETÜ and ETH Zurich); Ataberk Olgun, Ismail Emir Yuksel, Konstantinos Kanellopoulos, Mohammad Sadrosadati, Abdullah Giray Yaglikci, Onur Mutlu (ETH Zürich)

ρHammer: Reviving RowHammer Attacks on New Architectures via Prefetching

Weijie Chen, Shan Tang, Yulin Tang (Huazhong Univ. of Science and Technology); Xiapu Luo (The Hong Kong Polytechnic Univ.); Yinqian Zhang (Southern Univ. of Science and Technology); Weizhong Qiang (Huazhong Univ. of Science and Technology)

DRAM Fault Classification through Large-Scale Field Monitoring for Robust Memory RAS Management

Best Paper Candidate

Hoeju Chung, Euisang Oh, Seungmin Baek, Hyeongshin Yoon, Jaesung Yoo, Sanghwan Lee (SK hynix America); Yongjun Lee, Arhatha Bramhanand, Brett Dodds (Microsoft); Yang Zhou, Nam Sung Kim (Univ. of Illinois Urbana-Champaign)

DiffTest-H: Toward Semantic-Aware Communication in Hardware-Accelerated Processor Verification

Kunlin You, Yinan Xu (SKLP, Inst. of Computing Technology, Chinese Academy of Sciences); Kehan Feng (Beijing Inst. of Open Source Chip); Luoshan Cai (SKLP, Inst. of Computing Technology, Chinese Academy of Sciences); Yaoyang Zhou (Beijing Inst. of Open Source Chip); Yungang Bao (SKLP, Inst. of Computing Technology, Chinese Academy of Sciences)

SymbFuzz: Symbolic Execution Guided Hardware Fuzzing

Samit Shahnawaz Miftah, Amisha Srivastava (Univ. of Texas at Dallas); Hyunmin Kim (Technology Innovation Inst.); Shiyi Wei, Kanad Basu (Univ. of Texas at Dallas)

Session 7D: Graph processing and HPC

Session Chair: Sang-Woo Jun (UC Irvine)

TransFusion: End-to-End Transformer Acceleration via Graph Fusion and Pipelining

Linxuan Zhang, J. Nelson Amaral, Di Niu (Univ. of Alberta)

X-SET: An Efficient Graph Pattern Matching Accelerator With Order-Aware Parallel Intersection Units

Chenxi Xu (The Hong Kong Univ. of Science and Technology (Guangzhou)); Tianhui Shi (QingCheng.AI); Shixuan Sun (Shanghai Jiao Tong Univ.); Jidong Zhai (Tsinghua Univ.); Xinyu Chen (The Hong Kong Univ. of Science and Technology (Guangzhou))

FALA: Locality-Aware PIM-Host Cooperation for Graph Processing with Fine-Grained Column Access

Changmin Shin, Jaeyong Song, Seongmin Na, Jun Sung, Hongsun Jang, Jinho Lee (Seoul National Univ.)

Rethinking Tiling and Dataflow for SpMM Acceleration: A Graph Transformation Framework

Amir Ghazizadeh Ahsaei, Lingxiang Yin, Shilin Tian, Fangzhou Ye, Fan Yao, Hao Zheng (Univ. of Central Florida)

Boosting Task Scheduling Data Locality with Low-latency, HW-accelerated Label Propagation

Lucas Morais, Juan Miguel de Haro Ruiz (Barcelona Supercomputing Center and Universitat Politecnica de Catalunya); Alfredo Goldman (Univ. of São Paulo); Guido Araujo (Univ. of Campinas); Giacomo Pedretti, Jim Ignowski (HPE); Michael Frank (MagiCore); Xavier Martorell, Daniel Jiménez-González, Carlos Álvarez (Barcelona Supercomputing Center and Universitat Politecnica de Catalunya)

3:10 PM KST – 3:30 PM KST: Coffee Break

3:30 PM KST – 4:30 PM KST: Special Panel

AI Demon Hunters

Moderator: Gabriel Loh, AMD Research

Description
The research community is already focused on the "big" bottlenecks and challenges for AI, such as TOPS, power, memory bandwidth and capacity, and TCO. However, as AI systems continue to scale, other (perhaps less obvious) challenges and issues could significantly stifle the widespread, positively impactful use of AI across society. Our esteemed panelists will share their thoughts and perspectives on these lurking "AI Demons." This panel will provide discussion and debate, along with audience Q&A, aiming to share insights and help guide the research agenda for the MICRO community.

Panelists

Trevor Carlson, Professor (National University of Singapore)
Deming Chen, Professor (University of Illinois Urbana-Champaign)
Esha Choukse, Principal Researcher (Microsoft Azure Research)
Daehyun Kim, Executive VP (Samsung Research)
Hoshik Kim, Sr. VP and Fellow (SK hynix)
Hyesoon Kim, Professor (Georgia Institute of Technology)

4:30 PM KST – 9:00 PM KST: Excursion, Banquet

Buses depart starting at 4:30 PM
5:00 PM to 7:00 PM for Excursion
7:00 PM to 9:00 PM for Dinner


Day 3: Wednesday, October 22

8:30 AM KST – 9:30 AM KST

Session 8A: Systems for AI (Data Representations)

Session Chair: Pantea Zardoshti (Microsoft)

BitL: A Hybrid Bit-Serial and Parallel Deep Learning Accelerator for Critical Path Reduction

Seunghyun Lee, Dongho Ha, Sungbin Kim, Sungwoo Kim (Yonsei Univ.); Hyunwuk Lee (Samsung Electronics); Won Woo Ro (Yonsei Univ.)

HiPACK: Efficient Sub-8-Bit Direct Convolution with SIMD and Bitwise Management

Yao Chen (National Univ. of Singapore); Cheng Gong (Tiangong Univ.); Bingsheng He (National Univ. of Singapore)

MCBP: A Memory-Compute Efficient LLM Inference Accelerator Leveraging Bit-Slice-enabled Sparsity and Repetitiveness

Huizheng Wang, Zichuan Wang, Zhiheng Yue, Yousheng Long, Taiquan Wei, Jianxun Yang, Yang Wang (Tsinghua Univ.); Chao Li (Shanghai Jiao Tong Univ.); Shaojun Wei, Yang Hu, Shouyi Yin (Tsinghua Univ.)

Session 8B: Systems for AI (Processor architecture)

Session Chair: Gwangsun Kim (POSTECH)

PolymorPIC: Embedding Polymorphic Processing-in-Cache in RISC-V based Processor for Full-stack Efficient AI Inference

Cheng Zou, Ziling Wei, Lee Jun Yan, Chen Nie, Kang You, Zhezhi He (Shanghai Jiao Tong Univ.)

MHE-TPE: Multi-Operand High-Radix Encoder for Mixed-Precision Fixed-Point Tensor Processing Engines

Qizhe Wu, Jinyi Zhou, Zhanhe Hu (Univ. of Science and Technology of China); Zhichen Zeng (Univ. of Washington); Huawen Liang, Jiuru Zhu, Linfeng Tao (Univ. of Science and Technology of China); Xin Zhang (China Univ. of Mining and Technology); Zekang Cheng (Univ. of Science and Technology of China); Letian Zhao, Wei Yuan, Xiaotian Wang, Xi Jin (Univ. of Science and Technology of China)

SuperMesh: Energy-Efficient Collective Communications for Accelerators

Sabuj Laskar, Pranati Majhi, Abdullah Muzahid, Eun Jung Kim (Texas A&M Univ.)

Session 8C: Emerging Applications - 1

Session Chair: Hyokeun Lee (Ajou Univ.)

SMX: Heterogeneous Architecture for Universal Sequence Alignment Acceleration

Max Doblas Font (Barcelona Supercomputing Center); Po Jui Shih (Cornell Univ.); Oscar Lostes-Cazorla (Barcelona Supercomputing Center, Universitat Politecnica de Catalunya); Miquel Moreto (UPC/BSC); Christopher Batten (Cornell Univ.); Santiago Marco-Sola (Universitat Politecnica de Catalunya - Barcelona Supercomputing Center)

MINDFUL: Safe, Implantable, Large-Scale Brain-Computer Interfaces from a System-Level Design Perspective

Guy Eichler, Yatin Gilhotra, Nanyu Zeng, Martha Kim, Kenneth Shepard, Luca Carloni (Columbia Univ.)

DS-TIDE: Harnessing Dynamical Systems for Efficient Time-Independent Differential Equation Solving

Chuan Liu, Chunshu Wu, Ruibing Song, Guangyan Sun (Univ. of Rochester); Ying Nian Wu (Univ. of California, Los Angeles); Yousu Chen (Pacific Northwest National Laboratory); Ang Li (PNNL and UW); Tong Geng (Univ. of Rochester / Rice Univ.)

9:30 AM KST – 10:00 AM KST: Coffee Break

10:00 AM KST – 11:20 AM KST

Session 9A: Security and Privacy - Cryptography, Speculation and Computational Storage

Session Chair: Mingzhe Zhang (Ant Research)

Towards Closing the Performance Gap for Cryptographic Kernels Between CPUs and Specialized Hardware

Naifeng Zhang, Sophia Fu, Franz Franchetti (Carnegie Mellon Univ.)

HAWK: Fully Homomorphic Encryption Accelerator with Fixed-Word Key Decomposition Switching

Liang Kong (Ant Group); Shengyu Fan, Xianglong Deng (Chinese Academy of Sciences, CAS); Guang Fan, Lei Chen (Ant Group); Guiming Shi (Tsinghua Univ.); Yilan Zhu, Geng Yang, Shoumeng Yan, Mingzhe Zhang (Ant Group)

ShadowBinding: Realizing Effective Microarchitectures for In-Core Secure Speculation Schemes

Amund Bergland Kvalsvik, Magnus Själander (Norwegian Univ. of Science and Technology)

SmartPIR: A Private Information Retrieval System using Computational Storage Devices

Zehao Chen, Honghui You, Qian Wei (Shandong Univ.); Hang Lu (Inst. of Computing Technology, Chinese Academy of Sciences); Zhaoyan Shen, Lei Ju (Shandong Univ.)

Session 9B: Memory

Session Chair: Chang Hyun Park (Uppsala Univ.)

Beyond Page Migration: Enhancing Tiered Memory Performance via Integrated Last-Level Cache Management and Page Migration

Best Paper Candidate

Hwanjun Lee, Minho Kim, Yeji Jung, Seonmu Oh (DGIST); Ki-Dong Kang (ETRI); Seunghak Lee (DGIST/Samsung Electronics); Daehoon Kim (Yonsei Univ.)

Learning to Walk: Architecting Learned Virtual Memory Translation

Kaiyang Zhao, Yuang Chen, Xenia Xu (Carnegie Mellon Univ.); Dan Schatzberg (Meta); Nastaran Hajinazar, Rupin Vakharwala, Andy Anderson (Intel); Dimitrios Skarlatos (Carnegie Mellon Univ.)

Delegato: Locality-Aware Atomic Memory Operations on Chiplets

Best Paper Candidate

Víctor Soria-Pardos (Barcelona Supercomputing Center (BSC)); Adrià Armejach (Universitat Politecnica de Catalunya (UPC) & Barcelona Supercomputing Center (BSC)); Tiago Mück (Arm); Darío Suárez (Universidad de Zaragoza); Jose Joao (Arm); Miquel Moretó (Universitat Politecnica de Catalunya (UPC) & Barcelona Supercomputing Center (BSC))

Re-architecting End-host Networking with CXL: Coherence, Memory, and Offloading

Houxiang Ji (Univ. of Illinois Urbana-Champaign); Yifan Yuan (Meta); Yang Zhou (Univ. of Illinois Urbana-Champaign); Ipoom Jeong (Yonsei Univ.); Ren Wang (Intel); Saksham Agarwal, Nam Sung Kim (Univ. of Illinois Urbana-Champaign)

Session 9C: Emerging Applications - 2

Session Chair: Mark Jeffrey (Univ. of Toronto)

GCC: A 3DGS Inference Architecture with Gaussian-Wise and Cross-Stage Conditional Processing

Minnan Pei, Gang Li, Junwen Si, Zeyu Zhu, Zitao Mo, Peisong Wang (Inst. of Automation, Chinese Academy of Sciences); Zhuoran Song, Xiaoyao Liang (Shanghai Jiao Tong Univ.); Jian Cheng (Inst. of Automation, Chinese Academy of Sciences)

RTGS: Real-Time 3D Gaussian Splatting SLAM via Multi-Level Redundancy Reduction

Leshu Li, Jiayin Qin (Univ. of Minnesota, Twin Cities); Jie Peng (Univ. of North Carolina at Chapel Hill); Zishen Wan (Georgia Inst. of Technology); Huaizhi Qu (Univ. of North Carolina at Chapel Hill); Ye Han (Vanderbilt Univ.); Pingqing Zheng, Hongsen Zhang (Univ. of Minnesota, Twin Cities); Yu Cao (Univ. of Minnesota, Twin Cities); Tianlong Chen (Univ. of North Carolina at Chapel Hill); Yang (Katie) Zhao (Univ. of Minnesota, Twin Cities)

REACT3D: Real-time Edge Accelerator for Incremental Training in 3D Gaussian Splatting based SLAM Systems

Hongyi Wang (Tsinghua Univ.); Zhenhua Zhu (Tsinghua Univ.; HKUST); Tianchen Zhao, Yunfei Xiang, Zehao Wang, Jincheng Yu, Huazhong Yang (Tsinghua Univ.); Yuan Xie (HKUST); Yu Wang (Tsinghua Univ.)

PointISA: ISA-Extensions for Efficient Point Cloud Analytics via Architecture and Algorithm Co-Design

Meng Han (Tsinghua Univ.); Liang Wang, Limin Xiao, Hao Zhang, Bowen Jiang, Xilong Xie (Beihang Univ.); Jianfeng Zhu, Shaojun Wei, Leibo Liu (Tsinghua Univ.)

11:30 AM KST – 11:50 AM KST: Closing Remarks

