In this talk, I will discuss important factors in designing a high-performance processor core, giving a historical perspective of where the industry has been and highlighting some of the attributes of our latest Zen3 core.
Mike Clark is an AMD Corporate Fellow and is the Chief Architect for x86 cores. He graduated from the University of Illinois in 1993 with a B.S. in computer engineering and started work on the K5 processor at AMD. He served as chief architect of the Zen core and has contributed to every x86 processor from K5 to the latest Zen generation. A co-architect of the AMD64 ISA as well as the AMD-V virtualization extension, he has 28 patents on these and other technologies. He also received a master's degree in computer engineering from the University of Texas in 2003 while working at AMD.
Critical sectors such as business, health, and economics are powered by data-driven decisions made using data exploration tools. The efficiency of these tools, however, is limited by increasing heterogeneity: on one hand, the diversity of data formats and workloads requires either building task-specialized systems or transforming workloads to match the expectations of a single system, sacrificing expressiveness and structural information. On the other hand, heterogeneity in hardware platforms requires specializing each task to the microarchitecture of the device at hand, e.g., a CPU or a GPU. Relying on pre-set workload expectations or microarchitectural parameters results in long and convoluted critical execution paths. The result is a tradeoff between rigid, task- and device-optimized systems and inefficient, general-purpose, hardware-oblivious systems.
To maintain optimal execution of diverse ad-hoc tasks on any hardware platform, we need real-time intelligence, i.e., dynamic optimization of the critical code paths during execution, when all relevant information is available. Real-time intelligent systems learn and use information as the user's requests are executed to build optimized data access and filtering. I will show how a real-time intelligent database system specializes and optimizes itself for the hardware, workload, and data at runtime, enabling next-generation database systems to exploit the capabilities of modern hardware advancements.
Anastasia Ailamaki is a Professor of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland and the co-founder of RAW Labs SA, a Swiss company developing real-time analytics infrastructures for heterogeneous big data from multiple sources. She earned a Ph.D. in Computer Science from the University of Wisconsin–Madison in 2000. She received the 2019 ACM SIGMOD Edgar F. Codd Innovations Award and the 2020 VLDB Women in Database Research Award. She is also the recipient of an ERC Consolidator Award (2013), the Finmeccanica endowed chair from the Computer Science Department at Carnegie Mellon (2007), a European Young Investigator Award from the European Science Foundation (2007), an Alfred P. Sloan Research Fellowship (2005), an NSF CAREER award (2002), and ten best-paper awards at database, storage, and computer architecture conferences. She is an ACM fellow, an IEEE fellow, the Laureate for the 2018 Nemitsas Prize in Computer Science, and an elected member of the Swiss, the Belgian, the Greek, and the Cypriot National Research Councils. She is a member of the Academia Europaea and of the World Economic Forum Expert Network.
Sarita Adve is the Richard T. Cheng Professor of Computer Science at the University of Illinois at Urbana-Champaign. Her research interests span the system stack, ranging from hardware to applications. She co-developed the memory models for the C++ and Java programming languages based on her early work on data-race-free models. Recently, her group released ILLIXR (Illinois Extended Reality testbed), the first open source extended reality system. She is also known for her work on heterogeneous systems and software-driven approaches for hardware resiliency. She is a member of the American Academy of Arts and Sciences, a fellow of the ACM and IEEE, and a recipient of the ACM/IEEE-CS Ken Kennedy award, the Anita Borg Institute Women of Vision award in innovation, the ACM SIGARCH Maurice Wilkes award, and the University of Illinois campus award for excellence in graduate student mentoring. As ACM SIGARCH chair, she co-founded the CARES movement, winner of the CRA distinguished service award, to address discrimination and harassment in Computer Science research events. She received her Ph.D. from the University of Wisconsin-Madison and her B.Tech. from the Indian Institute of Technology, Bombay.
Koji Inoue received the B.E. and M.E. degrees in computer science from Kyushu Institute of Technology, Japan, in 1994 and 1996, respectively. He received the Ph.D. degree from the Department of Computer Science and Communication Engineering, Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan, in 2001. In 1999, he joined Halo LSI Design & Technology, Inc., NY, as a circuit designer. He is currently a professor in the Department of Advanced Information Technology, Kyushu University. His research interests include power-aware computing, high-performance computing, secure computing, nano-photonic computing, and superconductor computing.
Onur Mutlu is a Professor of Computer Science at ETH Zurich. He is also a faculty member at Carnegie Mellon University, where he previously held the Strecker Early Career Professorship. His current broader research interests are in computer architecture, systems, hardware security, and bioinformatics. A variety of techniques he, along with his group and collaborators, has invented over the years have influenced industry and have been employed in commercial microprocessors and memory/storage systems. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held various product and research positions at Intel Corporation, Advanced Micro Devices, VMware, and Google. He received the IEEE Computer Society Edward J. McCluskey Technical Achievement Award, the ACM SIGARCH Maurice Wilkes Award, the inaugural IEEE Computer Society Young Computer Architect Award, the inaugural Intel Early Career Faculty Award, US National Science Foundation CAREER Award, Carnegie Mellon University Ladd Research Award, faculty partnership awards from various companies, and a healthy number of best paper or "Top Pick" paper recognitions at various computer systems, architecture, and hardware security venues. He is an ACM Fellow "for contributions to computer architecture research, especially in memory systems", IEEE Fellow for "contributions to computer architecture research and practice", and an elected member of the Academy of Europe (Academia Europaea).
Per Stenström is a professor of computer engineering at Chalmers University of Technology, Sweden. His research interests are in computer architecture. He has authored or co-authored four textbooks, about 200 publications, and 20 patents in this area. He is known for his contributions to high-performance memory systems, for which he was named a Fellow of the ACM and of the IEEE, among other honors. He has served as editor-in-chief and program chair of prestigious scientific journals and conferences, including as program chair or co-chair of the IEEE/ACM International Symposium on Computer Architecture, the IEEE High-Performance Computer Architecture Symposium, the IEEE Parallel and Distributed Processing Symposium, the ACM International Conference on Supercomputing, and IEEE/ACM PACT. He is a member of the Royal Swedish Academy of Engineering Sciences, Academia Europaea, and the Royal Spanish Academy of Engineering Science.
Sreenivas Subramoney is currently a Senior Principal Engineer at Intel Corporation, where he heads the Processor Architecture Research (PAR) Lab at Intel Labs. The PAR Lab leads research into futuristic high-performance and highly secure CPUs to extend Intel's general-purpose compute leadership, and into next-generation processor architectures for AI and heterogeneous memory architectures. In prior roles, Sreenivas led the performance architecture of multiple generations of Intel's flagship client and server CPUs, co-developed Intel's first Java Virtual Machine, and was a key developer of the 64-bit Linux kernel for Intel's Itanium product family. Sree received his B.Tech. in Computer Science and Engineering from IIT Madras, India, in 1995, and his M.S. in Computer Engineering from The Ohio State University, USA, in 1996, and has worked at Intel since then.
Steve Keckler is the Vice President of Architecture Research at NVIDIA and Adjunct Professor of Computer Science at the University of Texas at Austin, where he served as a full-time faculty member (full professor with tenure) from 1998 to 2012. At NVIDIA, Dr. Keckler focuses on parallel, energy-efficient architectures that span mobile through supercomputing platforms. He is a Fellow of the ACM, a Fellow of the IEEE, an Alfred P. Sloan Research Fellow, and a recipient of the NSF CAREER award, the ACM Grace Murray Hopper award, the President's Associates Teaching Excellence Award at UT-Austin, and the Edith and Peter O'Donnell award for Engineering. He earned a B.S. in Electrical Engineering from Stanford University and an M.S. and a Ph.D. in Computer Science from the Massachusetts Institute of Technology.
The compute and memory demands of state-of-the-art neural networks have increased by several orders of magnitude in just the last few years, and there is no end in sight. Traditional forms of chip performance scaling are necessary but far from sufficient to run the ML models of the future. Beyond the chip itself, end-to-end system and software co-design is the only way to satisfy the performance demand. It requires vertical design across the entire technology stack: from the chip architecture, to the system and cluster design, through the compiler and software, and even unlocking the flexibility to rethink the neural network algorithms themselves.
In this talk, we will explore the fundamental properties of neural networks and why they are not well served by traditional architectures. We will examine how co-design can relax the traditional boundaries between technologies and enable designs specialized for neural networks with new architectural capabilities and performance. This co-design approach enables innovations such as wafer-scale chips, core sparse datapaths, specialized memories and interconnects, novel software mappings and execution models, and highly efficient sparse neural networks. We will explore this rich new design space using the Cerebras architecture as a case study, highlighting design principles and tradeoffs that enable the ML models of the future.
Sean Lie is co-founder and Chief Hardware Architect at Cerebras Systems, which builds high-performance ML accelerators. Prior to Cerebras, Sean was a Fellow and Chief Data Center Architect at AMD, where he was responsible for the architecture of the SeaMicro line of distributed servers. He holds a B.S. and M.Eng. in Electrical Engineering and Computer Science from MIT. Sean's primary interests are in high-performance computer architecture and hardware/software co-design in areas including transactional memory, networking, storage, and ML accelerators.
Best Paper Award
Student Research Competition Winners
MICRO Hall of Fame
MICRO Test of Time Award
B. Ramakrishna Rau Award