Viewing Conference Content

Papers are available in the program below. You can view paper presentations and live streams on Whova, either by downloading the app on your phone or tablet, or using the web app interface. See the symposium format & guidelines for information on how to participate. Note that Whova content is available only to people who register for the conference.



Day 1: Tuesday, October 19

9:45 AM EDT - 10:00 AM EDT

16:45 (EEST/Athens)
Welcome Message from the General Chair
Dimitris Gizopoulos (National and Kapodistrian University of Athens)

Welcome Message from the Program Co-Chairs
Aamer Jaleel (NVIDIA); Jishen Zhao (University of California, San Diego)

10:00 AM EDT - 11:00 AM EDT: Keynote by Michael T. Clark (AMD)

17:00 (EEST/Athens)

Abstract
In this talk, I will discuss important factors in designing a high-performance processor core, giving a historical perspective of where the industry has been and highlighting some of the attributes of our latest Zen3 core.


Bio
Mike Clark is an AMD Corporate Fellow and is the Chief Architect for x86 cores. He graduated from the University of Illinois in 1993 with a B.S. in computer engineering and started work on the K5 processor at AMD. He served as chief architect of the Zen core and has contributed to every x86 processor from K5 to the latest Zen generation. A co-architect of the AMD64 ISA as well as the AMD-V virtualization extension, he has 28 patents on these and other technologies. He also received a master's degree in computer engineering from the University of Texas in 2003 while working at AMD.

11:00 AM EDT - 11:15 AM EDT: Break (socialize on GatherTown)

18:00 (EEST/Athens)

11:15 AM EDT - 12:15 PM EDT

18:15 (EEST/Athens)
Session Chair: Chris Wilkerson (Intel)
Best Paper Nominee
11:15 AM EDT - 11:30 AM EDT
APOLLO: An Automated Power Modeling Framework for Runtime Power Introspection in High-Volume Commercial Microprocessors
Zhiyao Xie (Duke University); Xiaoqing Xu, Matt Walker, Joshua Knebel, Kumaraguru Palaniswamy, Nicolas Hebert (ARM Ltd.); Jiang Hu (Texas A&M University); Huanrui Yang, Yiran Chen (Duke University); Shidhartha Das (ARM Ltd.)

Best Paper Nominee
11:30 AM EDT - 11:45 AM EDT
TIP: Time-Proportional Instruction Profiling
Bjorn Gottschall (Norwegian University of Science and Technology); Lieven Eeckhout (Ghent University); Magnus Jahre (Norwegian University of Science and Technology)

Best Paper Nominee
11:45 AM EDT - 12:00 PM EDT
NDS: N-Dimensional Storage
Yu-Chia Liu, Hung-Wei Tseng (University of California, Riverside)

Best Paper Nominee
12:00 PM EDT - 12:15 PM EDT
GPS: A Global Publish Subscribe Model for Multi-GPU Memory Management
Harini Muthukrishnan (University of Michigan); Daniel Lustig, David Nellans (NVIDIA); Thomas Wenisch (University of Michigan)

12:15 PM EDT - 1:30 PM EDT

19:15 (EEST/Athens)
Session Chair: Xing Hu (ICT, CAS)
12:15 PM EDT - 12:30 PM EDT
ParaBit: Processing Parallel Bitwise Operations in NAND Flash Memory Based SSDs
Congming Gao (Tsinghua University); Xin Xin (University of Pittsburgh); Youyou Lu (Tsinghua University); Youtao Zhang, Jun Yang (University of Pittsburgh); Jiwu Shu (Tsinghua University)

12:30 PM EDT - 12:45 PM EDT
Distributed Data Persistency
Apostolos Kokolis, Antonis Psistakis, Benjamin Reidys, Jian Huang, Josep Torrellas (University of Illinois Urbana-Champaign)

12:45 PM EDT - 1:00 PM EDT
COSPlay: Leveraging Task-Level Parallelism for High-Throughput Synchronous Persistence
Marina Vemmou, Alexandros Daglis (Georgia Institute of Technology)

1:00 PM EDT - 1:15 PM EDT
RACER: Bit-Pipelined Processing Using Resistive Memory
Minh S. Q. Truong, Eric Chen, Deanyone Su, Liting Shen, Alexander Glass, L. Richard Carley, James A. Bain (Carnegie Mellon University); Saugata Ghose (University of Illinois Urbana-Champaign)

1:15 PM EDT - 1:30 PM EDT
LADDER: Architecting Content and Location-Aware Writes for Crossbar Resistive Memories
Md Hafizul Islam Chowdhuryy, Muhammad Rashedul Haq Rashed (University of Central Florida); Amro Awad (North Carolina State University); Rickard Ewetz, Fan Yao (University of Central Florida)
Session Chair: Emre Ozer (Arm)
12:15 PM EDT - 12:30 PM EDT
GreenDIMM: OS-Assisted DRAM Power Management for DRAM with a Sub-Array Granularity Power-Down State
Seunghak Lee, Ki-Dong Kang, Hwanjun Lee, Hyungwon Park (DGIST); Younghoon Son (Samsung Electronics); Nam Sung Kim (University of Illinois Urbana-Champaign); Daehoon Kim (DGIST)

12:30 PM EDT - 12:45 PM EDT
NMAP: Power Management Based on Network Packet Processing Mode Transition for Latency-Critical Workloads
Ki-Dong Kang, Gyeongseo Park, Hyosang Kim (DGIST); Mohammad Alian (University of Kansas); Nam Sung Kim (University of Illinois Urbana-Champaign / Samsung); Daehoon Kim (DGIST)

12:45 PM EDT - 1:00 PM EDT
BurstLink: Techniques for Energy-Efficient Conventional and Virtual Reality Video Display
Jawad Haj-Yahya, Jisung Park, Rahul Bera, Juan Gomez Luna (ETH Zurich); Efraim Rotem (Intel); Taha Shahroodi (TU Delft); Jeremie Kim, Onur Mutlu (ETH Zurich)

1:00 PM EDT - 1:15 PM EDT
ReplayCache: Enabling Volatile Caches for Energy Harvesting Systems
Jianping Zeng, Jongouk Choi (Purdue University); Xinwei Fu (Virginia Tech); Ajay Paddayuru Shreepathi, Dongyoon Lee (Stony Brook University); Changwoo Min (Virginia Tech); Changhee Jung (Purdue University)

1:15 PM EDT - 1:30 PM EDT
AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning
Young Geun Kim (Soongsil University); Carole-Jean Wu (Arizona State University)

1:30 PM EDT - 1:45 PM EDT: Break (socialize on GatherTown)

20:30 (EEST/Athens)

1:45 PM EDT - 3:00 PM EDT

20:45 (EEST/Athens)
Session Chair: Jakub Szefer (Yale)
1:45 PM EDT - 2:00 PM EDT
IceClave: A Trusted Execution Environment for In-Storage Computing
Luyi Kang (University of Maryland, College Park); Yuqi Xue, Weiwei Jia, Xiaohao Wang (University of Illinois Urbana-Champaign); Jongryool Kim, Changhwan Youn, Myeong Joon Kang, Hyung Jin Lim (SK Hynix); Bruce Jacob (University of Maryland, College Park); Jian Huang (University of Illinois Urbana-Champaign)

2:00 PM EDT - 2:15 PM EDT
DarKnight: An Accelerated Framework for Privacy and Integrity Preserving Deep Learning Using Trusted Hardware
Hanieh Hashemi, Yongqin Wang, Murali Annavaram (University of Southern California)

2:15 PM EDT - 2:30 PM EDT
2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency
Yonggan Fu, Yang Zhao, Qixuan Yu, Chaojian Li, Yingyan Lin (Rice University)

2:30 PM EDT - 2:45 PM EDT
F1: A Fast and Programmable Accelerator for Fully Homomorphic Encryption
Nikola Samardzic, Axel Feldmann, Aleksandar Krastev, Srinivas Devadas (MIT); Ronald Dreslinski, Christopher Peikert (University of Michigan); Daniel Sanchez (MIT)

2:45 PM EDT - 3:00 PM EDT
Cryptographic Capability Computing
Michael LeMay, Joydeep Rakshit, Sergej Deutsch, David M. Durham, Santosh Ghosh, Anant Nori, Jayesh Gaur, Andrew Weiler, Salmin Sultana, Karanvir Grewal, Sreenivas Subramoney (Intel Labs)
Session Chair: Alaa Alameldeen (Simon Fraser)
1:45 PM EDT - 2:00 PM EDT
TRiM: Enhancing Processor-Memory Interfaces with Scalable Tensor Reduction in Memory
Jaehyun Park, Byeongho Kim, Sungmin Yun (Seoul National University); Eojin Lee (Inha University); Minsoo Rhu (KAIST); Jung Ho Ahn (Seoul National University)

2:00 PM EDT - 2:15 PM EDT
SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems
Maciej Besta (ETH Zurich); Raghavendra Kanakagiri (IIT Tirupati); Grzegorz Kwasniewski (ETH Zurich); Rachata Ausavarungnirun (King Mongkut's University of Technology North Bangkok); Jakub Beránek (Technical University of Ostrava); Konstantinos Kanellopoulos (ETH Zurich); Kacper Janda (AGH-UST); Zur Vonarburg-Shmaria, Lukas Gianinazzi, Ioana Stefan, Juan Gómez Luna, Jakub Golinowski, Marcin Copik, Lukas Kapp-Schwoerer, Salvatore Di Girolamo, Nils Blach (ETH Zurich); Marek Konieczny (AGH-UST); Onur Mutlu, Torsten Hoefler (ETH Zurich)

2:15 PM EDT - 2:30 PM EDT
OrderLight: Lightweight Memory-Ordering Primitive for Efficient Fine-Grained PIM Computations
Anirban Nag (Uppsala University); Rajeev Balasubramonian (University of Utah)

2:30 PM EDT - 2:45 PM EDT
Sunder: Enabling Low-Overhead and Scalable Near-Data Pattern Matching Acceleration
Elaheh Sadredini (University of California, Riverside); Reza Rahimi (University of Virginia); Mohsen Imani (University of California, Irvine); Kevin Skadron (University of Virginia)

2:45 PM EDT - 3:00 PM EDT
SAM: Accelerating Strided Memory Accesses
Xin Xin, Yanan Guo (University of Pittsburgh); Youtao Zhang (Computer Science Department, University of Pittsburgh); Jun Yang (University of Pittsburgh)

3:00 PM EDT - 4:15 PM EDT

22:00 (EEST/Athens)
Session Chair: Yatin Manerkar (Michigan / Berkeley)
3:00 PM EDT - 3:15 PM EDT
Efficient, Distributed, Non-Speculative Multi-Address Atomic Operations
Eduardo Jose Gomez-Hernandez, Juan M. Cebrian, Ruben Titos-Gil (University of Murcia); Stefanos Kaxiras (Uppsala University); Alberto Ros (University of Murcia)

3:15 PM EDT - 3:30 PM EDT
Cohmeleon: Learning-Based Orchestration of Accelerator Coherence in Heterogeneous SoCs
Joseph Zuckerman, Davide Giri, Jihye Kwon, Paolo Mantovani, Luca P. Carloni (Columbia University)

3:30 PM EDT - 3:45 PM EDT
Fat Loads: Exploiting Locality Amongst Contemporaneous Load Operations to Optimize Cache Accesses
Vanshika Baoni, Adarsh Mittal, Gurindar S. Sohi (University of Wisconsin-Madison)

3:45 PM EDT - 4:00 PM EDT
Criticality Driven Fetch
Aniket Deshmukh, Yale N. Patt (UT Austin)

4:00 PM EDT - 4:15 PM EDT
Software-Defined Vector Processing on Manycore Fabrics
Philip Bedoukian, Neil Adit, Edwin Peguero, Adrian Sampson (Cornell University)
Session Chair: Saugata Ghose (Illinois)
3:00 PM EDT - 3:15 PM EDT
Cerebros: Evading the RPC Tax in Datacenters
Arash Pourhabibi, Mark Sutherland (EPFL); Alexandros Daglis (Georgia Institute of Technology); Babak Falsafi (EPFL)

3:15 PM EDT - 3:30 PM EDT
Equinox: Training (for Free) on a Custom Inference Accelerator
Mario Drummond (CodeDepot); Louis Coulon, Arash Pourhabibi, Ahmet Yuzuguler, Babak Falsafi, Martin Jaggi (EPFL)

3:30 PM EDT - 3:45 PM EDT
MithriLog: Near-Storage Accelerator for High-Performance Log Analytics
Seongyoung Kang, Jiyoung An (University of California, Irvine); Jinpyo Kim (VMware); Sang-Woo Jun (University of California, Irvine)

3:45 PM EDT - 4:00 PM EDT
PointAcc: Efficient Point Cloud Accelerator
Yujun Lin, Zhekai Zhang, Haotian Tang, Hanrui Wang, Song Han (MIT)

4:00 PM EDT - 4:15 PM EDT
A Hardware Accelerator for Protocol Buffers
Sagar Karandikar (UC Berkeley); Chris Leary, Chris Kennelly (Google); Jerry Zhao, Dinesh Parimi (UC Berkeley); Borivoje Nikolic (University of California, Berkeley); Krste Asanovic (University of California Berkeley); Parthasarathy Ranganathan (Google)

4:15 PM EDT - 5:15 PM EDT: Business Meeting

23:15 (EEST/Athens)


Day 2: Wednesday, October 20

10:00 AM EDT - 11:00 AM EDT: Keynote by Anastasia Ailamaki (EPFL)

17:00 (EEST/Athens)

Abstract
Critical sectors such as business, health, and the economy are powered by data-driven decisions made using data exploration tools. The efficiency of these tools, however, is diminished by increasing heterogeneity. On the one hand, the diversity of data formats and workloads requires either building task-specialized systems or transforming workloads to match the expectations of a single system, sacrificing expressiveness and structural information. On the other hand, heterogeneity in hardware platforms requires task specialization to the microarchitecture of the device at hand, e.g., a CPU or a GPU. Relying on pre-set workload expectations or microarchitectural parameters results in long and convoluted critical execution paths. There is a tradeoff between rigid, task- and device-optimized systems and inefficient, general-purpose, hardware-oblivious systems.

To maintain optimal execution of diverse ad-hoc tasks on any hardware platform we need real-time intelligence, i.e., dynamic optimization of the critical code paths during execution, when all relevant information is available. Real-time intelligent systems learn and use information as the user's requests are executed to build optimized data access and filtering. I will show how a real-time intelligent database system specializes and self-optimizes itself to hardware, workload, and data at runtime, enabling next-generation database systems to exploit the capabilities of modern hardware advancements.


Bio
Anastasia Ailamaki is a Professor of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland and the co-founder of RAW Labs SA, a Swiss company developing real-time analytics infrastructures for heterogeneous big data from multiple sources. She earned a Ph.D. in Computer Science from the University of Wisconsin–Madison in 2000. She received the 2019 ACM SIGMOD Edgar F. Codd Innovations Award and the 2020 VLDB Women in Database Research Award. She is also the recipient of an ERC Consolidator Award (2013), the Finmeccanica endowed chair from the Computer Science Department at Carnegie Mellon (2007), a European Young Investigator Award from the European Science Foundation (2007), an Alfred P. Sloan Research Fellowship (2005), an NSF CAREER award (2002), and ten best-paper awards in database, storage, and computer architecture conferences. She is an ACM fellow, an IEEE fellow, the Laureate for the 2018 Nemitsas Prize in Computer Science, and an elected member of the Swiss, the Belgian, the Greek, and the Cypriot National Research Councils. She is a member of the Academia Europaea and of the World Economic Forum Expert Network.

11:00 AM EDT - 11:15 AM EDT: Break (socialize on GatherTown)

18:00 (EEST/Athens)

11:15 AM EDT - 12:15 PM EDT: Student Research Competition Talks

18:15 (EEST/Athens)

12:15 PM EDT - 1:30 PM EDT

19:15 (EEST/Athens)
Session Chair: Divya Mahajan (Microsoft)
12:15 PM EDT - 12:30 PM EDT
Archytas: A Framework for Synthesizing and Dynamically Optimizing Accelerators for Robotic Localization
Weizhuang Liu (Tianjin University); Bo Yu (PerceptIn); Yiming Gan (University of Rochester); Qiang Liu (Tianjin University); Jie Tang (South China University of Technology); Shaoshan Liu (PerceptIn); Yuhao Zhu (University of Rochester)

12:30 PM EDT - 12:45 PM EDT
HoloAR: On-the-Fly Optimization of 3D Holographic Processing for Augmented Reality
Shulin Zhao, Haibo Zhang, Cyan Subhra Mishra, Sandeepa Bhuyan, Ziyu Ying, Mahmut Taylan Kandemir, Anand Sivasubramaniam, Chita Das (The Pennsylvania State University)

12:45 PM EDT - 1:00 PM EDT
NOVIA: A Framework for Discovering Non-Conventional Inline Accelerators
David Trilla, John-David Wellman, Alper Buyuktosunoglu, Pradip Bose (IBM Research)

1:00 PM EDT - 1:15 PM EDT
Noema: Hardware-Efficient Template Matching for Neural Population Pattern Detection
Ameer Abdelhadi, Eugene Sha, Ciaran Bannon (University of Toronto); Hendrik Steenland (NeuroTek Innovative Technology Inc.); Andreas Moshovos (University of Toronto)

1:15 PM EDT - 1:30 PM EDT
SquiggleFilter: An Accelerator for Portable Virus Detection
Tim Dunn, Harisankar Sadasivan, Jack Wadden, Kush Goliya, Kuan-Yu Chen, David Blaauw, Reetuparna Das, Satish Narayanasamy (University of Michigan)
Session Chair: Dongyoon Lee (Stony Brook)
12:15 PM EDT - 12:30 PM EDT
UC-Check: Characterizing Micro-Operation Caches in x86 Processors and Implications in Security and Performance
Joonsung Kim, Hamin Jang, Hunjun Lee, Seungho Lee, Jangwoo Kim (Seoul National University)

12:30 PM EDT - 12:45 PM EDT
Network-on-Chip Microarchitecture-Based Covert Channel in GPUs
Jaeguk Ahn, Jiho Kim, Hans Kasan (KAIST); Leila Delshadtehrani (Boston University); Wonjun Song (Kangwon University); Ajay Joshi (Boston University); John Kim (KAIST)

12:45 PM EDT - 1:00 PM EDT
Validation of Side-Channel Models via Observation Refinement
Pablo Buiras (KTH Royal Institute of Technology); Hamed Nemati (Stanford University / CISPA Helmholtz Center for Information Security); Andreas Lindner, Roberto Guanciale (KTH Royal Institute of Technology)

1:00 PM EDT - 1:15 PM EDT
GhostMinion: A Strictness-Ordered Cache System for Spectre Mitigation
Sam Ainsworth (University of Edinburgh)

1:15 PM EDT - 1:30 PM EDT
Speculative Privacy Tracking (SPT): Leaking Information From Speculative Execution Without Compromising Privacy
Rutvik Choudhary, Jiyong Yu, Christopher Fletcher (University of Illinois Urbana-Champaign); Adam Morrison (Tel Aviv University)

1:30 PM EDT - 1:45 PM EDT: Break (socialize on GatherTown)

20:30 (EEST/Athens)

1:45 PM EDT - 3:00 PM EDT

20:45 (EEST/Athens)
Session Chair: Amro Awad (NC State)
1:45 PM EDT - 2:00 PM EDT
HARP: Practically and Effectively Identifying Uncorrectable Errors in Memory Chips That Use On-Die Error-Correcting Codes
Minesh Patel, Geraldo Francisco de Oliveira Junior, Onur Mutlu (ETH Zurich)

2:00 PM EDT - 2:15 PM EDT
Characterizing and Mitigating Soft Errors in GPU DRAM
Michael Sullivan, Nirmal Saxena, Mike O'Connor, Donghyuk Lee, Paul Racunas, Saurabh Hukerikar, Timothy Tsai, Siva Kumar Sastry Hari, Stephen W. Keckler (NVIDIA)

2:15 PM EDT - 2:30 PM EDT
Turnpike: Lightweight Soft Error Resilience for In-Order Cores
Jianping Zeng (Purdue University); Hongjune Kim, Jaejin Lee (Seoul National University); Changhee Jung (Purdue University)

2:30 PM EDT - 2:45 PM EDT
Effective Processor Verification with Logic Fuzzer Enhanced Co-Simulation
Nursultan Kabylkas (UC Santa Cruz); Tommy Thorn (Esperanto Technologies); Shreesha Srinath (Intel); Polychronis Xekalakis (NVIDIA); Jose Renau (UC Santa Cruz)

2:45 PM EDT - 3:00 PM EDT
Synthesizing Formal Models of Hardware from RTL for Efficient Verification of Memory Model Implementations
Yao Hsiao (Stanford University); Dominic P. Mulligan, Nikos Nikoleris, Gustavo Petri (Arm Research); Caroline Trippel (Stanford University)
Session Chair: Adwait Jog (William & Mary)
1:45 PM EDT - 2:00 PM EDT
Ohm-GPU: Integrating New Optical Network and Heterogeneous Memory into GPU Multi-Processors
Jie Zhang, Myoungsoo Jung (KAIST)

2:00 PM EDT - 2:15 PM EDT
Intersection Prediction for Accelerated GPU Ray Tracing
Lufei Liu, Wesley Chang (University of British Columbia); Francois Demoullin (Qualcomm); Yuan Hsi Chou, Mohammadreza Saed (University of British Columbia); David Pankratz (University of Alberta); Tyler Nowicki (Huawei Technologies); Tor M. Aamodt (University of British Columbia)

2:15 PM EDT - 2:30 PM EDT
Principal Kernel Analysis: A Tractable Methodology to Simulate Scaled GPU Workloads
Cesar Avalos Baddouh; Mahmoud Khairy (Purdue University); Roland N. Green (Cerebras Systems Inc.); Mathias Payer (EPFL); Timothy G. Rogers (Purdue University)

2:30 PM EDT - 2:45 PM EDT
AccelWattch: A Power Modeling Framework for Modern GPUs
Vijay Kandiah (Northwestern University); Scott Peverelle (Intel); Mahmoud Khairy, Junrui Pan, Amogh Manjunath, Timothy G. Rogers (Purdue University); Tor Aamodt (University of British Columbia); Nikos Hardavellas (Northwestern University)

2:45 PM EDT - 3:00 PM EDT
Vortex: Extending the RISC-V ISA for GPGPU and 3D-Graphics Research
Blaise Tine, Krishna Praveen Yalamarthy, Fares Elsabbagh, Hyesoon Kim (Georgia Institute of Technology)

3:00 PM EDT - 4:15 PM EDT

22:00 (EEST/Athens)
Session Chair: Pedro Trancoso (Chalmers)
3:00 PM EDT - 3:15 PM EDT
Enabling Branch-Mispredict Level Parallelism by Selectively Flushing Instructions
Stijn Eyerman, Wim Heirman, Sam Van Den Steen, Ibrahim Hur (Intel)

3:15 PM EDT - 3:30 PM EDT
PDede: Partitioned, Deduplicated, Delta Branch Target Buffer
Niranjan K Soundararajan (Intel Labs); Peter Braun (University of California, Santa Cruz); Tanvir Ahmed Khan, Baris Kasikci (University of Michigan); Heiner Litz (University of California, Santa Cruz); Sreenivas Subramoney (Intel Labs)

3:30 PM EDT - 3:45 PM EDT
Leveraging Targeted Value Prediction to Unlock New Hardware Strength Reduction Potential
Arthur Perais (Centre National de la Recherche Scientifique)

3:45 PM EDT - 4:00 PM EDT
Branch Runahead: An Alternative to Branch Prediction for Impossible to Predict Branches
Stephen Pruett, Yale Patt (UT Austin)

4:00 PM EDT - 4:15 PM EDT
Twig: Profile-Guided BTB Prefetching for Data Center Applications
Tanvir Ahmed Khan, Nathan Brown, Akshitha Sriraman (University of Michigan); Niranjan K Soundararajan (Intel Labs); Rakesh Kumar (Norwegian University of Science and Technology); Joseph Devietti (University of Pennsylvania); Sreenivas Subramoney (Intel Labs); Gilles A Pokam (Intel); Heiner Litz (University of California, Santa Cruz); Baris Kasikci (University of Michigan)
Session Chair: Lisa Wu Wills (Duke)
3:00 PM EDT - 3:15 PM EDT
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference
Thierry Tambe, Coleman Hooper, Lillian Pentecost, Tianyu Jia, En-Yu Yang (Harvard University); Marco Donato (Tufts University); Victor Sanh (Hugging Face); Paul Whatmough (Arm Research / Harvard); Alexander M. Rush (Cornell University); David Brooks, Gu-Yeon Wei (Harvard University)

3:15 PM EDT - 3:30 PM EDT
HiMA: A Fast and Scalable History-Based Memory Access Engine for Differentiable Neural Computer
Yaoyu Tao, Zhengya Zhang (University of Michigan, Ann Arbor)

3:30 PM EDT - 3:45 PM EDT
FPRaker: A Processing Element for Accelerating Neural Network Training
Omar Mohamed Awad (University of Toronto/Huawei); Mostafa Mahmoud (University of Toronto); Isak Edo (University of Toronto/Arm); Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan (University of Toronto); Gennady Pekhimenko, Andreas Moshovos (University of Toronto/Vector Institute)

3:45 PM EDT - 4:00 PM EDT
RecPipe: Co-Designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance
Udit Gupta (Harvard University/FAIR); Samuel Hsia, Jeff Zhang, Mark Wilkening, Javin Pombra (Harvard University); Hsien-Hsin Sean Lee (Facebook AI Research); Gu-Yeon Wei (Harvard University); Carole-Jean Wu (Facebook/ASU); David Brooks (Harvard University)

4:00 PM EDT - 4:15 PM EDT
Shift-BNN: Highly-Efficient Probabilistic Bayesian Neural Network Training via Memory-Friendly Pattern Retrieving
Qiyu Wan (University of Houston); Haojun Xia (University of Sydney); Xingyao Zhang (University of Washington); Lening Wang (University of Houston); Shuaiwen Leon Song (University of Sydney / UW Seattle); Xin Fu (University of Houston)

4:15 PM EDT - 4:30 PM EDT: Break (socialize on GatherTown)

23:15 (EEST/Athens)

4:30 PM EDT - 5:45 PM EDT: Panel

23:30 (EEST/Athens)

Sarita Adve is the Richard T. Cheng Professor of Computer Science at the University of Illinois at Urbana-Champaign. Her research interests span the system stack, ranging from hardware to applications. She co-developed the memory models for the C++ and Java programming languages based on her early work on data-race-free models. Recently, her group released ILLIXR (Illinois Extended Reality testbed), the first open source extended reality system. She is also known for her work on heterogeneous systems and software-driven approaches for hardware resiliency. She is a member of the American Academy of Arts and Sciences, a fellow of the ACM and IEEE, and a recipient of the ACM/IEEE-CS Ken Kennedy award, the Anita Borg Institute Women of Vision award in innovation, the ACM SIGARCH Maurice Wilkes award, and the University of Illinois campus award for excellence in graduate student mentoring. As ACM SIGARCH chair, she co-founded the CARES movement, winner of the CRA distinguished service award, to address discrimination and harassment in Computer Science research events. She received her Ph.D. from the University of Wisconsin-Madison and her B.Tech. from the Indian Institute of Technology, Bombay.


Koji Inoue received the B.E. and M.E. degrees in computer science from Kyushu Institute of Technology, Japan, in 1994 and 1996, respectively. He received the Ph.D. degree from the Department of Computer Science and Communication Engineering, Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan, in 2001. In 1999, he joined Halo LSI Design & Technology, Inc., NY, as a circuit designer. He is currently a professor in the Department of Advanced Information Technology, Kyushu University. His research interests include power-aware computing, high-performance computing, secure computing, nano-photonic computing, and superconductor computing.


Onur Mutlu is a Professor of Computer Science at ETH Zurich. He is also a faculty member at Carnegie Mellon University, where he previously held the Strecker Early Career Professorship. His current broader research interests are in computer architecture, systems, hardware security, and bioinformatics. A variety of techniques he, along with his group and collaborators, has invented over the years have influenced industry and have been employed in commercial microprocessors and memory/storage systems. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held various product and research positions at Intel Corporation, Advanced Micro Devices, VMware, and Google. He received the IEEE Computer Society Edward J. McCluskey Technical Achievement Award, the ACM SIGARCH Maurice Wilkes Award, the inaugural IEEE Computer Society Young Computer Architect Award, the inaugural Intel Early Career Faculty Award, US National Science Foundation CAREER Award, Carnegie Mellon University Ladd Research Award, faculty partnership awards from various companies, and a healthy number of best paper or "Top Pick" paper recognitions at various computer systems, architecture, and hardware security venues. He is an ACM Fellow "for contributions to computer architecture research, especially in memory systems", IEEE Fellow for "contributions to computer architecture research and practice", and an elected member of the Academy of Europe (Academia Europaea).


Per Stenström is a professor of computer engineering at Chalmers University of Technology, Sweden. His research interests are in computer architecture. He has authored or co-authored four textbooks and about 200 publications and 20 patents in this area. He is known for his contributions to high-performance memory systems, for which he was named a Fellow of the ACM and the IEEE, among other awards. He has acted as editor-in-chief and program chair of prestigious scientific journals and conferences. He has been program chair/co-chair of the IEEE/ACM Symposium on Computer Architecture, the IEEE High-Performance Computer Architecture Symposium, the IEEE Parallel and Distributed Processing Symposium, ACM International Conference on Supercomputing and IEEE/ACM PACT. He is a member of the Royal Swedish Academy of Engineering Sciences, Academia Europaea and the Royal Spanish Academy of Engineering Science.


Sreenivas Subramoney is currently a Senior Principal Engineer at Intel Corporation where he heads the Processor Architecture Research (PAR) Lab at Intel Labs. PAR Lab leads research into futuristic high-performance and highly-secure CPUs to extend Intel's general-purpose compute leadership, and into next-generation processor architectures for AI and heterogeneous memory architectures. In prior roles, Sreenivas led the performance architecture of multiple generations of Intel's flagship microprocessor client and server CPUs, co-developed Intel's first Java Virtual Machine and was a key developer of the 64-bit Linux kernel for Intel's Itanium product family. Sree received his B.Tech. in Computer Science and Engineering from IIT Madras, India in 1995, and his M.S. degree in Computer Engineering at The Ohio State University, USA in 1996 and has worked for Intel since then.


Steve Keckler is the Vice President of Architecture Research at NVIDIA and Adjunct Professor of Computer Science at the University of Texas at Austin, where he served as a full-time faculty member (full professor with tenure) from 1998 to 2012. At NVIDIA, Dr. Keckler focuses on parallel, energy-efficient architectures that span mobile through supercomputing platforms. He is a Fellow of the ACM, a Fellow of the IEEE, an Alfred P. Sloan Research Fellow, and a recipient of the NSF CAREER award, the ACM Grace Murray Hopper award, the President's Associates Teaching Excellence Award at UT-Austin, and the Edith and Peter O'Donnell award for Engineering. He earned a B.S. in Electrical Engineering from Stanford University and an M.S. and a Ph.D. in Computer Science from the Massachusetts Institute of Technology.



Day 3: Thursday, October 21

10:00 AM EDT - 11:00 AM EDT: Keynote by Sean Lie (Cerebras)

17:00 (EEST/Athens)

Abstract
The compute and memory demands from state-of-the-art neural networks have increased several orders of magnitude in just the last couple of years, and there's no end in sight. Traditional forms of scaling chip performance are necessary but far from sufficient to run the ML models of the future. In addition to the chip, end-to-end system and software co-design is the only way to satisfy the performance demand. It requires vertical design across the entire technology stack: from the chip architecture, to the system and cluster design, through the compiler and software, and even unlocking the flexibility to rethink the neural network algorithms themselves.

In this talk, we will explore the fundamental properties of neural networks and why they are not well served by traditional architectures. We will examine how co-design can relax the traditional boundaries between technologies and enable designs specialized for neural networks with new architectural capabilities and performance. This co-design approach enables innovations such as wafer-scale chips, core sparse datapaths, specialized memories and interconnects, novel software mappings and execution models, and highly efficient sparse neural networks. We will explore this rich new design space using the Cerebras architecture as a case study, highlighting design principles and tradeoffs that enable the ML models of the future.


Bio
Sean Lie is co-founder and Chief Hardware Architect at Cerebras Systems, which builds high performance ML accelerators. Prior to Cerebras, Sean was a Fellow and Chief Data Center Architect at AMD where he was responsible for the architecture of the SeaMicro line of distributed servers. He holds a BS and MEng in Electrical Engineering and Computer Science from MIT. Sean's primary interests are in high performance computer architecture and hardware/software codesign in areas including transactional memory, networking, storage, and ML accelerators.

11:00 AM EDT - 11:15 AM EDT: Break

18:00 (EEST/Athens)

11:15 AM EDT - 12:15 PM EDT

18:15 (EEST/Athens)

Best Paper Award


Student Research Competition Winners


MICRO Hall of Fame


MICRO Test of Time Award


B. Ramakrishna Rau Award

12:15 PM EDT - 1:30 PM EDT

19:15 (EEST/Athens)
Session Chair: Yunong Shi (Amazon)
12:15 PM EDT - 12:30 PM EDT
Exploiting Different Levels of Parallelism in the Quantum Control Microarchitecture for Superconducting Qubits
Mengyu Zhang (Tencent Quantum Laboratory); Lei Xie (Tsinghua University); Zhenxing Zhang, Qiaonian Yu, Guanglei Xi, Hualiang Zhang, Fuming Liu, Yarui Zheng, Yicong Zheng, Shengyu Zhang (Tencent Quantum Laboratory)

12:30 PM EDT - 12:45 PM EDT
SMART: A Heterogeneous Scratchpad Memory Architecture for Superconductor SFQ-Based Systolic CNN Accelerators
Farzaneh Zokaee, Lei Jiang (Indiana University Bloomington)

12:45 PM EDT - 1:00 PM EDT
AutoBraid: A Framework for Enabling Efficient Surface Communication in Quantum Computing
Fei Hua, Yanhao Chen, Yuwei Jin (Rutgers University); Chi Zhang (University of Pittsburgh); Ari Hayes (Rutgers University); Youtao Zhang (University of Pittsburgh); Eddy Z. Zhang (Rutgers University)

1:00 PM EDT - 1:15 PM EDT
JigSaw: Boosting Fidelity of NISQ Programs via Measurement Subsetting
Poulami Das (Georgia Tech); Swamit Tannu (University of Wisconsin-Madison); Moinuddin Qureshi (Georgia Tech)

1:15 PM EDT1:30 PM EDT
ADAPT: Mitigating Idling Errors in Qubits via Adaptive Dynamical Decoupling
Poulami Das (Georgia Tech); Swamit Tannu (University of Wisconsin-Madison); Siddharth Dangwal (IIT Delhi); Moinuddin Qureshi (Georgia Tech)

Session Chair: Tony Nowatzki (UCLA)
12:15 PM EDT12:30 PM EDT
Distilling Bit-Level Sparsity Parallelism for General Purpose Deep Learning Acceleration
Hang Lu (Institute of Computing Technology, Chinese Academy of Sciences); Liang Chang, Chenglong Li, Zixuan Zhu (University of Electronic Science and Technology of China); Shengjian Lu, Yanhuan Liu, Mingzhe Zhang (Institute of Computing Technology, Chinese Academy of Sciences)

12:30 PM EDT12:45 PM EDT
Sanger: A Co-Design Framework for Enabling Sparse Attention using Reconfigurable Architecture
Liqiang Lu, Yicheng Jin, Hangrui Bi, Zizhang Luo (Peking University); Peng Li (Advanced Institute of Information Technology, Peking University); Tao Wang, Yun Liang (Peking University)

12:45 PM EDT1:00 PM EDT
ESCALATE: Boosting the Efficiency of Sparse CNN Accelerator with Kernel Decomposition
Shiyu Li, Edward Hanson (Duke University); Xuehai Qian (University of Southern California); Hai "Helen" Li, Yiran Chen (Duke University)

1:00 PM EDT1:15 PM EDT
SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator
Subhankar Pal, Aporva Amarnath, Siying Feng (University of Michigan); Michael O'Boyle (University of Edinburgh); Ronald Dreslinski (University of Michigan); Christophe Dubach (McGill University)

1:15 PM EDT1:30 PM EDT
Capstan: A Vector RDA for Sparsity
Alexander Rucker, Matthew Vilim, Tian Zhao (Stanford University); Yaqi Zhang, Raghu Prabhakar (SambaNova Systems, Inc.); Kunle Olukotun (Stanford University)

1:30 PM EDT1:45 PM EDT: Break

20:30 (EEST/Athens)

1:45 PM EDT3:00 PM EDT

20:45 (EEST/Athens)
Session Chair: Yingyan (Celine) Lin (Rice)
1:45 PM EDT2:00 PM EDT
Improving Streaming Graph Processing Performance Using Input Knowledge
Abanti Basak, Zheng Qu, Jilan Lin (University of California, Santa Barbara); Alaa R. Alameldeen (Simon Fraser University); Zeshan Chishti (Intel); Yufei Ding, Yuan Xie (University of California, Santa Barbara)

2:00 PM EDT2:15 PM EDT
I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement Through Islandization
Tong Geng (Pacific Northwest National Laboratory); Chunshu Wu (Boston University); Yongan Zhang (Rice University); Cheng Tan, Chenhao Xie (Pacific Northwest National Laboratory); Haoran You (Rice University); Martin Herbordt (Boston University); Yingyan Lin (Rice University); Ang Li (Pacific Northwest National Laboratory)

2:15 PM EDT2:30 PM EDT
Fifer: Practical Acceleration of Irregular Applications on Reconfigurable Architectures
Quan Nguyen, Daniel Sanchez (MIT)

2:30 PM EDT2:45 PM EDT
Point-X: A Spatial-Locality-Aware Architecture for Energy-Efficient Graph-Based Point-Cloud Deep Learning
Jie-Fang Zhang, Zhengya Zhang (University of Michigan, Ann Arbor)

2:45 PM EDT3:00 PM EDT
JetStream: Graph Analytics on Streaming Data with Event-Driven Hardware Accelerator
Shafiur Rahman, Mahbod Afarin, Nael Abu-Ghazaleh, Rajiv Gupta (University of California, Riverside)

Session Chair: Djordje Jevdjic (NUS)
1:45 PM EDT2:00 PM EDT
Trident: Harnessing Architectural Resources for All Page Sizes in x86 Processors
Venkat Sri Sai Ram, Ashish Panwar, Arkaprava Basu (Indian Institute of Science)

2:00 PM EDT2:15 PM EDT
Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning
Rahul Bera (ETH Zurich); Anant Nori (Intel); Konstantinos Kanellopoulos (ETH Zurich); Taha Shahroodi (TU Delft); Sreenivas Subramoney (Intel Labs); Onur Mutlu (ETH Zurich)

2:15 PM EDT2:30 PM EDT
Morrigan: A Composite Instruction TLB Prefetcher
Georgios Vavouliotis, Lluc Alvarez (Universitat Politecnica de Catalunya / Barcelona Supercomputing Center); Boris Grot (University of Edinburgh); Daniel Jiménez (Texas A&M University); Marc Casas (Barcelona Supercomputing Center)

2:30 PM EDT2:45 PM EDT
Improving Address Translation in Multi-GPUs via Sharing and Spilling Aware TLB Design
Bingyao Li (University of Pittsburgh); Jieming Yin (Lehigh University); Youtao Zhang, Xulong Tang (University of Pittsburgh)

2:45 PM EDT3:00 PM EDT
Increasing GPU Translation Reach by Leveraging Under-Utilized On-Chip Resources
Jagadish B. Kotra, Michael LeBeane (AMD Research); Mahmut Kandemir (Pennsylvania State University); Gabriel H. Loh (AMD Research)

3:00 PM EDT4:15 PM EDT

22:00 (EEST/Athens)
Session Chair: Hoda Naghibijouybari (Binghamton)
3:00 PM EDT3:15 PM EDT
A Deeper Look into RowHammer's Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses
Lois Orosa, Abdullah Giray Yaglikci, Haocong Luo (ETH Zurich); Ataberk Olgun (TOBB University of Economics and Technology); Jisung Park, Hasan Hassan, Minesh Patel, Jeremie S. Kim, Onur Mutlu (ETH Zurich)

3:15 PM EDT3:30 PM EDT
Uncovering In-DRAM RowHammer Protection Mechanisms: A New Methodology, Custom RowHammer Patterns, and Implications
Hasan Hassan (ETH Zurich); Yahya Can Tugrul (TOBB University of Economics and Technology); Jeremie S. Kim (ETH Zurich); Victor van der Veen (Qualcomm); Kaveh Razavi, Onur Mutlu (ETH Zurich)

3:30 PM EDT3:45 PM EDT
Soteria: Towards Resilient Integrity-Protected and Encrypted Non-Volatile Memories
Kazi Abu Zubair (North Carolina State University); Sudhanva Gurumurthi, Vilas Sridharan (AMD); Amro Awad (North Carolina State University)

3:45 PM EDT4:00 PM EDT
Bonsai Merkle Forests: Efficiently Achieving Crash Consistency in Secure Persistent Memory
Alexander Freij, Huiyang Zhou (North Carolina State University); Yan Solihin (University of Central Florida)

4:00 PM EDT4:15 PM EDT
Dolos: Improving the Performance of Persistent Applications in ADR-Supported Secure Memory
Xijing Han, James Tuck, Amro Awad (North Carolina State University)

Session Chair: Lu Peng (LSU)
3:00 PM EDT3:15 PM EDT
The Laplace Microarchitecture for Tracking Data Uncertainty and Its Implementation in a RISC-V Processor
Vasileios Tsoutsouras (University of Cambridge / Signaloid); Orestis Kaparounakis (Signaloid); Bilgesu Bilgin, Chatura Samarakoon, James Meech, Jan Heck (University of Cambridge); Phillip Stanley-Marbell (University of Cambridge / Signaloid)

3:15 PM EDT3:30 PM EDT
Post-Fabrication Microarchitecture
Chanchal Kumar (North Carolina State University / ARM Ltd.); Anirudh Seshadri (North Carolina State University); Aayush Chaudhary (North Carolina State University / Samsung); Shubham Bhawalkar (North Carolina State University / Nuvia Inc.); Rohit Singh, Eric Rotenberg (North Carolina State University)

3:30 PM EDT3:45 PM EDT
PCCS: Processor-Centric Contention Slowdown Model for Heterogeneous System-on-Chips
Yuanchao Xu (North Carolina State University); Mehmet Esat Belviranli (Colorado School of Mines); Xipeng Shen (North Carolina State University / Facebook); Jeffrey Vetter (Oak Ridge National Laboratory)

3:45 PM EDT4:00 PM EDT
ITSLF: Inter-Thread Store-to-Load Forwarding in Simultaneous Multithreading
Josue Feliu, Alberto Ros, Manuel E. Acacio (Universidad de Murcia); Stefanos Kaxiras (Uppsala University)

4:00 PM EDT4:15 PM EDT
ENMC: Extreme Near-Memory Classification via Approximate Screening
Liu Liu, Jilan Lin, Zheng Qu, Yufei Ding, Yuan Xie (University of California, Santa Barbara)

4:15 PM EDT4:30 PM EDT: Closing Remarks

23:15 (EEST/Athens)