Main Program

Start time Monday Oct 16, 2017
7:00 Breakfast
8:00 Opening Remarks  

Keynote 1: Krisztián Flautner, VP Technology, ARM
To a Trillion and Beyond: the Future of the Internet of Things

As we move from an era of human-centric connectivity to a new machine-centric era, there are numerous opportunities to innovate, providing solutions that can scale to trillions of interconnected intelligent devices. The defining feature of this era will be that computers become invisible and move from attention-grabbing devices into the background. For some this is utopia, for others it conjures the imagery of George Orwell. This talk will examine the technology underpinnings of this interconnected future and explore how the notion of trust must evolve to keep pace with both engineering and social developments.

Bio: Krisztián Flautner is vice president of technology at ARM. Previously, Kris was general manager of the Internet of Things Business Unit and VP of Research and Development at ARM. He is focused on new business opportunities and the proliferation of ARM technologies. He received a PhD in computer science and engineering, along with a number of other degrees, from the University of Michigan, where he has also served as a visiting scholar. He has authored or co-authored over 80 publications, including key contributions on computer architecture, software, and microarchitecture. Flautner has received various best paper awards, including the 2017 ISCA Influential Paper Award for groundbreaking research in power-efficient computing, which shows that at least one of these ideas has received broad adoption by industry and researchers alike.


Lightning Session I (1.5 min/paper)

10:20 Break  
10:40 1-A DRAM. Daniel Jimenez, TAMU 1-B Accelerators. Krste Asanovic, Berkeley
  Banshee: Bandwidth-Efficient DRAM Caching Via Software/Hardware Cooperation
Xiangyao Yu:MIT; Christopher J. Hughes:Intel; Nadathur Satish:Intel; Onur Mutlu:ETH; Srinivas Devadas:MIT
UDP: A Programmable Accelerator for Extract-Transform-Load Workloads and More
Yuanwei Fang:University of Chicago; Chen Zou:University of Chicago; Aaron J. Elmore:University of Chicago; Andrew A. Chien:University of Chicago and Argonne National Laboratory
  ConTutto - A Novel FPGA-based Prototyping Platform Enabling Innovation in the Memory Subsystem of a Server Class Processor
Bharat Sukhwani:IBM Research; Thomas Roewer:IBM Research; Charles L. Haymes:IBM Research; Kyu-Hyoun Kim:IBM Research; Adam J. McPadden:IBM; Daniel M. Dreps:IBM; Dean Sanner:IBM; Jan Van Lunteren:IBM Research; Sameh Asaad:IBM Research
UNFOLD: A Memory-Efficient Speech Recognizer Using On-The-Fly WFST Composition
Reza Yazdani:Universitat Politecnica de Catalunya; Jose-Maria Arnau:Universitat Politecnica de Catalunya; Antonio Gonzalez:Universitat Politecnica de Catalunya
  Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content
Samira Khan:University of Virginia; Chris Wilkerson:Intel; Zhe Wang:Intel; Alaa Alameldeen:Intel; Donghyuk Lee:NVIDIA; Onur Mutlu:ETH Zurich
IDEAL: Image DEnoising AcceLerator
Mostafa Mahmoud:University of Toronto; Bojian Zheng:University of Toronto; Alberto Delmas Lascorz:University of Toronto; Felix Heide:Stanford University/Algolux; Jonathan Assouline:Algolux; Paul Boucher:Algolux; Emmanuel Onzon:Algolux; Andreas Moshovos:University of Toronto
  Fine-Grained DRAM: Energy Efficient DRAM for Extreme Bandwidth Systems
Mike O'Connor:NVIDIA / UT-Austin; Niladrish Chatterjee:NVIDIA; Donghyuk Lee:NVIDIA; John Wilson:NVIDIA; Aditya Agrawal:NVIDIA; Stephen W. Keckler:NVIDIA / UT-Austin; William J. Dally:NVIDIA / Stanford
Pipelining a Triggered Processing Element
Thomas J. Repetti:Columbia University; Joao Pedro Cerqueira:Columbia University; Martha A. Kim:Columbia University; Mingoo Seok:Columbia University
12:00  Lunch   
13:00  Legends of MICRO   
14:00 2-A GPUs-I. Brad Beckman, AMD 2-B Non-Volatile Memory/Storage. Samira Khan, U. Virginia
  Efficient Exception Handling Support for GPUs
Ivan Tanasic:Universitat Politecnica de Catalunya / Barcelona Supercomputing Center; Isaac Gelado:NVIDIA; Marc Jorda:Barcelona Supercomputing Center; Eduard Ayguade:Universitat Politecnica de Catalunya / Barcelona Supercomputing Center; Nacho Navarro:Universitat Politecnica de Catalunya / Barcelona Supercomputing Center
Proteus: A Flexible and Fast Software Supported Hardware Logging approach for NVM
Seunghee Shin:North Carolina State University; Satish Kumar Tirukkovalluri:North Carolina State University; James Tuck:North Carolina State University; Yan Solihin:North Carolina State University
  Beyond the Socket: NUMA-Aware GPUs
Ugljesa Milic:Barcelona Supercomputing Center / Universitat Politecnica de Catalunya; Oreste Villa:Nvidia; Evgeny Bolotin:Nvidia; Akhil Arunkumar:Arizona State University; Eiman Ebrahimi:Nvidia; Aamer Jaleel:Nvidia; Alex Ramirez:Google; David Nellans:Nvidia
Efficient Support of Position Independence on Non-Volatile Memory
Guoyang Chen:Qualcomm; Lei Zhang:NCSU; Richa Budhiraja:Qualcomm; Xipeng Shen:NCSU; Youfeng Wu:Intel
  Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes
Rachata Ausavarungnirun:Carnegie Mellon University; Joshua Landgraf:UT Austin; Vance Miller:UT Austin; Saugata Ghose:Carnegie Mellon University; Jayneel Gandhi:VMware Research Group; Christopher J. Rossbach:UT Austin and VMware Research Group; Onur Mutlu:ETH Zurich and Carnegie Mellon University
Incidental Computing on IoT Nonvolatile Processors
Kaisheng Ma:Penn State; Xuqing Li:Penn State; Jinyang Li:Tsinghua Univ.; Yongpan Liu:Tsinghua Univ.; Yuan Xie:UCSB; Jack Sampson:Penn State; Mahmut Taylan Kandemir:Penn State; Vijaykrishnan Narayanan:Penn State
  RegLess: Just-in-Time Operand Staging for GPUs
John Kloosterman:University of Michigan; Jon Beaumont:University of Michigan; Davoud Anoushe Jamshidi:University of Michigan; Jonathan Bailey:University of Michigan; Trevor Mudge:University of Michigan; Scott Mahlke:University of Michigan
Summarizer: Trading Communication with Computing Near Storage
Gunjae Koo:University of Southern California; Kiran Kumar Matam:University of Southern California; Te I:North Carolina State University; H. V. Krishna Giri Narra:University of Southern California; Jing Li:University of California. San Diego; Hung-Wei Tseng:North Carolina State University; Steven Swanson:University of California. San Diego; Murali Annavaram:University of Southern California
  SCRATCH: An End-to-end Application-aware Soft-GPGPU Architecture and Trimming Tool
Pedro Duarte:Instituto de Telecomunicacoes. University of Coimbra. Portugal; Pedro Tomas:INESC-ID. Instituto Superior Tecnico. Universidade de Lisboa. Portugal; Gabriel Falcao:Instituto de Telecomunicacoes. University of Coimbra. Portugal
Memory Cocktail Therapy: A General Learning-Based Framework to Optimize Dynamic Tradeoffs in NVMs
Zhaoxia Deng:University of California. Santa Barbara; Lunkai Zhang:University of Chicago; Nikita Mishra:University of Chicago; Henry Hoffmann:University of Chicago; Fred Chong:University of Chicago
16:00  Transfer to Boston Aquarium  
17:00  Evening Panel and Social Event   
Start time Tuesday Oct 17, 2017
7:00 Breakfast

Keynote 2: Doug Burger, Distinguished Engineer (MSR NExT), Microsoft
Specialization and Accelerated AI at Hyperscale

We are in a new era of compute architecture, with specialization and heterogeneity growing rapidly to meet computing requirements. The tension between scale, economics, and specialization is driving fierce debates and many experiments about the right path forward. One of Microsoft's major efforts in this space has been to move to programmable hardware at global scale, to balance flexibility and efficiency as workloads--many of which need to be specialized--continue to evolve rapidly. This has led to a new hyperscale architecture that we call a Configurable Cloud. This talk will cover the reasons that Microsoft eventually chose this design, and will include the earlier failed attempts. The talk will also show how this system can be used for accelerating large-scale services, in particular deep learning via the recently announced Project Brainwave platform. The talk will conclude with some promising directions for both specialized architectures as a broad class and artificial intelligence workloads as an important specific class.

Bio: Doug Burger is a Distinguished Engineer at Microsoft, where he leads several research-to-production projects aimed at transforming the computing architecture of Microsoft's systems and devices. With Derek Chiou, he co-leads the Catapult project, which is incorporating FPGA technology at large scale into Microsoft's cloud architecture. His team also architected the Brainwave system, which is serving accelerated deep learning at large scale within Microsoft's cloud. Before joining Microsoft in 2008, he spent ten years on the Computer Science faculty at the University of Texas at Austin, where he co-led the TRIPS project with Steve Keckler. He is the recipient of the 2006 Maurice Wilkes Award, an IEEE Fellow, an ACM Fellow, an ex-athlete, and an avid father.


Lightning Session II (1.5 min/paper)

10:00 Break  
10:20 3-A In/Near Memory Computing. David Wood, U. Wisconsin 3-B Security. Chris Fletcher, UIUC
  A Many-core Architecture for In-Memory Data Processing
Sandeep R Agrawal:Oracle Labs; Sam Idicula:Oracle Labs; Arun Raghavan:Oracle Labs; Evangelos Vlachos:Oracle Labs; Venkatraman Govindaraju:Oracle Labs; Venkatanathan Varadarajan:Oracle Labs; Cagri Balkesen:Oracle Labs; Georgios Giannikis:Oracle Labs; Charlie Roth:NXP; Nipun Agarwal:Oracle Labs; Eric Sedlar:Oracle Labs
RHMD: Evasion-Resilient Hardware Malware Detectors
Khaled N. Khasawneh:University of California. Riverside; Nael Abu-Ghazaleh:University of California. Riverside; Dmitry Ponomarev:Binghamton University; Lei Yu:Binghamton University
  Cache Automaton
Arun Subramaniyan:University of Michigan; Jingcheng Wang:University of Michigan; Ezhil R. M. Balasubramanian:University of Michigan; David Blaauw:University of Michigan; Dennis Sylvester:University of Michigan; Reetuparna Das:University of Michigan
Software-based Gate-level Information Flow Security for IoT Systems
Hari Cherupalli:University of Minnesota; Henry Duwe:University of Illinois; Weidong Ye:University of Illinois; Rakesh Kumar:University of Illinois; John Sartori:University of Minnesota
  Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology
Vivek Seshadri:Microsoft Research India; Donghyuk Lee:NVIDIA Research; Thomas Mullins:Intel; Hasan Hassan:ETH Zurich; Amirali Boroumand:CMU; Jeremie Kim:ETH Zurich; Michael A. Kozuch:Intel; Onur Mutlu:ETH Zurich; Phillip B. Gibbons:CMU; Todd C. Mowry:CMU
How secure is your cache against side-channel attacks?
Zecheng He:Princeton University; Ruby B. Lee:Princeton University
  DRISA: A DRAM-based Reconfigurable In-Situ Accelerator
Shuangchen Li:University of California. Santa Barbara; Dimin Niu:Samsung; Krishna T. Malladi:Samsung; Hongzhong Zheng:Samsung; Bob Brennan:Samsung; Yuan Xie:University of California. Santa Barbara
Constructing and Characterizing Covert Channels on GPGPUs
Hoda Naghibijouybari:University of California. Riverside; Khaled Khasawneh:University of California. Riverside; Nael Abu-Ghazaleh:University of California. Riverside
  PageForge: A Near-Memory Content-Aware Page-Merging Architecture
Dimitrios Skarlatos:University of Illinois at Urbana-Champaign; Nam Sung Kim:University of Illinois at Urbana-Champaign; Josep Torrellas:University of Illinois at Urbana-Champaign

12:00   Lunch
13:00  Awards Presentations
14:00   Break
14:30 4-A Deep Learning. Vivienne Sze, MIT 4-B Prediction. Alaa Alameldeen, Intel
  Scale-Out Acceleration for Machine Learning 
Jongse Park:Georgia Institute of Technology; Hardik Sharma:Georgia Institute of Technology; Divya Mahajan:Georgia Institute of Technology; Joon Kyung Kim:Georgia Institute of Technology; Hadi Esmaeilzadeh:University of California, San Diego
Using Branch Predictors to Predict Brain Activity in Brain-Machine Implants
Abhishek Bhattacharjee:Rutgers University/Princeton University
  Bit-Pragmatic Deep Neural Network Computing
Jorge Albericio:NVIDIA; Patrick Judd:University of Toronto; Alberto Delmas:University of Toronto; Sayeh Sharify:University of Toronto; Gerard O'Leary:University of Toronto; Roman Genov:University of Toronto; Andreas Moshovos:University of Toronto
Load Value Prediction via Path-based Address Prediction: Avoiding Mispredictions due to Conflicting Stores
Rami Sheikh:Qualcomm Technologies. Inc.; Harold W. Cain:Qualcomm Datacenter Technologies. Inc.; Raguram Damodaran:Qualcomm Technologies. Inc.
  CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices
Caiwen Ding:Syracuse University; Siyu Liao:City University of New York. City College; Yanzhi Wang:Syracuse University; Zhe Li:Syracuse University; Ning Liu:Syracuse University; Youwei Zhuo:University of Southern California; Chao Wang:University of Southern California; Xuehai Qian:University of Southern California; Yu Bai:California State University Fullerton; Geng Yuan:Syracuse University; Xiaolong Ma:Syracuse University; Yipeng Zhang:Syracuse University; Jian Tang:Syracuse University; Qinru Qiu:Syracuse University; Xue Lin:Northeastern University; Bo Yuan:City University of New York. City College
Multiperspective Reuse Prediction
Daniel A. Jiménez:Texas A&M; Elvira Teran:Texas A&M
15:30  Break 
15:50 5-A Consistency/Coherency Translation. Abhishek Bhattacharjee, Rutgers 5-B Energy. Bobbie Manne, Cavium
  CSALT: Context Switch Aware Large TLB
Yashwant Marathe:The University of Texas at Austin; Nagendra Gulur:Texas Instruments; Jee Ho Ryoo:The University of Texas at Austin; Shuang Song:The University of Texas at Austin; Lizy K. John:The University of Texas at Austin
Harnessing Voltage Margins for Energy Efficiency in Multicore CPUs
George Papadimitriou:University of Athens; Manolis Kaliorakis:University of Athens; Athanasios Chatzidimitriou:University of Athens; Dimitris Gizopoulos:University of Athens; Peter Lawthers:AppliedMicro; Shidhartha Das:ARM
  RTLCheck: Verifying the Memory Consistency of RTL Designs
Yatin A. Manerkar:Princeton University; Daniel Lustig:NVIDIA; Margaret Martonosi:Princeton University; Michael Pellauer:NVIDIA
Race-To-Sleep + Content Caching + Display Caching: A Recipe for Energy-efficient Video Streaming on Handhelds
Haibo Zhang:Penn State; Prasanna Venkatesh Rengasamy:Penn State; Shulin Zhao:Penn State; Nachiappan Chidambaram Nachiappan:Penn State; Anand Sivasubramaniam:Penn State; Mahmut Kandemir:Penn State; Ravi Iyer:Intel; Chita R. Das:Penn State
  Architecting Hierarchical Coherence Protocols for Push-button Parametric Verification
Opeoluwa Matthews:Duke University; Daniel J. Sorin:Duke University
BVF: Enabling Significant On-Chip Power Savings via Bit-Value-Favor for Throughput Processors
Ang Li:Pacific Northwest National Laboratory; Wenfeng Zhao:University of Minnesota; Shuaiwen Leon Song:Pacific Northwest National Laboratory
  PARSNIP: Performant Architecture for Race Safety with No Impact on Precision
Yuanfeng Peng:University of Pennsylvania; Benjamin P. Wood:Wellesley College; Joseph Devietti:University of Pennsylvania
Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks
Aditya Agrawal:UIUC; Josep Torrellas:UIUC; Sachin Idgunji:Nvidia
17:20  Break and Poster Session  
18:20  Business Meeting  
Start time Wednesday Oct 18, 2017
7:00 Breakfast
8:00 6-A GPUs-II. Michael Pellauer, Nvidia 6-B OS and System Design. Nathan Beckman, CMU
  Unleashing the Power of GPU for Physically Based Rendering via Dynamic Ray Shuffling
Yashuai Lv: Astronautical Engineering University; Libo Huang:National University of Defense Technology; Li Shen:National University of Defense Technology; Zhiying Wang:National University of Defense Technology
SchedTask: A Hardware-Assisted Task Scheduler
Prathmesh Kallurkar:IIT Delhi; Smruti R. Sarangi:IIT Delhi
  GPUpd: A Fast and Scalable Multi-GPU Architecture Using Cooperative Projection and Distribution
Youngsok Kim:Seoul National University; Jae-Eon Jo:POSTECH; Hanhwi Jang:POSTECH; Minsoo Rhu:POSTECH; Hanjun Kim:POSTECH; Jangwoo Kim:Seoul National University
Exploiting Heterogeneity for Tail Latency and Energy Efficiency
Md E. Haque:Rutgers University; Yuxiong He:Microsoft Research; Sameh Elnikety:Microsoft Research; Thu D. Nguyen:Rutgers University; Ricardo Bianchini:Microsoft Research; Kathryn S. McKinley:Google
  VersaPipe: A Versatile Programming Framework for Pipelined Computing on GPU
Zhen Zheng:Tsinghua University; Chanyoung Oh:University of Seoul; Jidong Zhai:Tsinghua University; Xipeng Shen:North Carolina State University; Youngmin Yi:University of Seoul; Wenguang Chen:Tsinghua University
TMI: Thread Memory Isolation for False Sharing Repair
Christian DeLozier:University of Pennsylvania; Ariel Eizenberg:University of Pennsylvania; Shiliang Hu:Intel; Gilles Pokam:Intel; Joseph Devietti:University of Pennsylvania
  WIREFRAME: Supporting Data-dependent Parallelism through Dependency Graph Execution in GPUs
AmirAli Abdolrashidi:University of California Riverside; Devashree Tripathy:University of California Riverside; Mehmet Esat Belviranli:Oak Ridge National Laboratories; Daniel Wong:University of California Riverside; Laxmi Narayan Bhuyan:University of California Riverside
Estimating and Understanding Architectural Risk
Weilong Cui:University of California. Santa Barbara; Timothy Sherwood:University of California. Santa Barbara
9:20  Break
9:30 7-A Unconventional Architectures. Hadi Esmaeilzadeh, Georgia Tech 7-B Compilers and Microarchitecture. Milos Prvulovic, Georgia Tech
  Hybrid Analog-Digital Solution of Nonlinear Partial Differential Equations
Yipeng Huang:Columbia University; Ning Guo:Columbia University; Mingoo Seok:Columbia University; Yannis Tsividis:Columbia University; Kyle Mandli:Columbia University; Simha Sethumadhavan:Columbia University
Improving the Effectiveness of Searching for Isomorphic Chains in Superword Level Parallelism
Joonmoo Huh:North Carolina State University; James Tuck:North Carolina State University
  Taming the Instruction Bandwidth of Quantum Computers via Hardware-Managed Error Correction
Swamit S. Tannu:Georgia Tech; Zachary A. Myers:Stanford; Prashant J. Nair:Georgia Tech; Douglas M. Carmean:Microsoft; Moinuddin K. Qureshi:Georgia Tech
Data Movement Aware Computation Partitioning
Xulong Tang:Penn State; Orhan Kislal:Penn State; Mahmut Kandemir:Penn State; Mustafa Karakoy:TOBB University
  Optimized Surface Code Communication in Superconducting Quantum Computers
Ali Javadi-Abhari:Princeton University; Pranav Gokhale:University of Chicago; Adam Holmes:University of Chicago; Diana Franklin:University of Chicago; Kenneth R. Brown:Georgia Institute of Technology; Margaret Martonosi:Princeton University; Frederic T. Chong:University of Chicago
Mirage Cores: The Illusion of Many Out-of-order Cores Using In-order Hardware
Shruti Padmanabha:University of Michigan; Andrew Lukefahr:University of Michigan; Reetuparna Das:University of Michigan; Scott Mahlke:University of Michigan
  Architectural Tradeoffs for Biodegradable Computing
Ting-Jung Chang:Princeton University; Zhuozhi Yao:Princeton University; Paul J. Jackson:Princeton University; Barry P. Rand:Princeton University; David Wentzlaff:Princeton University
Using Intra-Core Loop-Task Accelerators to Improve the Productivity and Performance of Task-Based Parallel Programs
Ji Kim:Cornell University; Shunning Jiang:Cornell University; Christopher Torng:Cornell University; Moyang Wang:Cornell University; Shreesha Srinath:Cornell University; Berkin Ilbeyi:Cornell University; Khalid Al-Hawaj:Cornell University; Christopher Batten:Cornell University
11:10  Best Paper Nominees. Margaret Martonosi, Princeton
Architectural Opportunities for Novel Dynamic EMI Shifting (DEMIS). Daphne I. Gorman:UC Santa Cruz; Jose Renau:UC Santa Cruz; Matthew Guthaus:UC Santa Cruz.
DeftNN: Addressing Bottlenecks for DNN Execution on GPUs via Synapse Vector Elimination and Near-compute Data Fission. Parker Hill:University of Michigan; Animesh Jain:University of Michigan; Mason Hill:University of Nevada, Las Vegas; Babak Zamirai:University of Michigan; Chang-Hong Hsu:University of Michigan; Michael Laurenzano:University of Michigan; Scott Mahlke:University of Michigan; Lingjia Tang:University of Michigan; Jason Mars:University of Michigan.
Hardware Supported Persistent Object Address Translation. Tiancong Wang:NC State University; Sakthikumaran Sambasivam:NC State University; Yan Solihin:NC State University; James Tuck:NC State University.
An Experimental Microarchitecture for a Superconducting Quantum Processor. X. Fu: QuTech, Delft University of Technology, Computer Engineering Lab, Delft University of Technology; M. A. Rol: QuTech, Delft University of Technology / Kavli Institute of Nanoscience, Delft University of Technology; C. C. Bultink: QuTech, Delft University of Technology / Kavli Institute of Nanoscience, Delft University of Technology; H. van Someren: QuTech, Delft University of Technology, Computer Engineering Lab, Delft University of Technology; N. Khammassi: QuTech, Delft University of Technology, Computer Engineering Lab, Delft University of Technology; I. Ashraf: QuTech, Delft University of Technology, Computer Engineering Lab, Delft University of Technology; R. F. L. Vermeulen: QuTech, Delft University of Technology / Kavli Institute of Nanoscience, Delft University of Technology; J. C. de Sterke: Topic Embedded Systems / QuTech, Delft University of Technology; W. J. Vlothuizen: Netherlands Organisation for Applied Scientific Research (TNO) / QuTech, Delft University of Technology; R. N. Schouten: QuTech, Delft University of Technology / Kavli Institute of Nanoscience, Delft University of Technology; C. G. Almudever: QuTech, Delft University of Technology, Computer Engineering Lab, Delft University of Technology; L. DiCarlo: QuTech, Delft University of Technology; Kavli Institute of Nanoscience, Delft University of Technology; K. Bertels: QuTech, Delft University of Technology, Computer Engineering Lab, Delft University of Technology.
12:30 Best Paper Announcement
12:45 Adjourn