MICRO-49 Homepage

Workshops & Tutorials

October 15, 2016 (Saturday)
Time / Room	406	405	403	402
7:30-8:30 Breakfast	Registration desk opens (until 17:30) at 4F in the conference hotel
8:30-10:30	NoCArc (9th International Workshop on Network on Chip Architectures) Organizers: Maurizio Palesi, Masoud Daneshtalab, Xiaohang Wang	Accel (Tutorial on Rapid Exploration of Accelerator-rich Architectures: Automation from Concept to Prototyping) Organizers: Jason Cong, Zhenman Fang, Yakun Sophia Shao	GPGPU (Tutorial on Intel Graphics Architecture: ISA and Microarchitecture) Organizers: Subramaniam Maiyuran, Jason Ross and Ken Lueh	Tejas (Tejas: a versatile Java based architectural simulator) Organizers: Prathmesh Kallurkar and Smruti Sarangi
10:30-11:00 Coffee Break
11:00-12:00
12:00-13:30 Lunch (4F)
13:30-15:00			BigBench+SAE: Instrumenting an Industry-Standard BigData Benchmark for BigData Analytics Organizers: Vijay Janapa Reddi, Nadav Chachmon, Magnus Christensson, Daniel Richins	NOPE (2nd Workshop on Negative Outcomes, Post-mortems, and Experiences) Organizers: Bob Adolf, Svilen Kanev, Brandon Reagen
15:00-15:30 Coffee Break
15:30-17:30
October 16, 2016 (Sunday)
Time / Room	403 (Streaming to 401/402)	405	406
7:30-8:30 Breakfast	Registration desk opens (until 17:30) at 4F in the conference hotel
8:30-10:30	HW-ML (Tutorial on Hardware Architectures for Deep Neural Networks) Organizers: Joel Emer, Vivienne Sze, and Yu-Hsin Chen	MemoryTech (Tutorial on Existing and Emerging Memory Technologies and Circuits) Organizers: Meng-Fan (Marvin) Chang, Yue-Der Chih, Helia Naeimi, Darsen Lu, Shih-Lien Lu, Dinesh Somasekhar and Shigeki Tomishima	IoT (Cognitive Edge Computing) Organizers: Ravi Iyer, Vijay Janapa Reddi and Shiao-Li Tsao
10:30-11:00 Coffee Break
11:00-12:00
12:00-13:30 Lunch
13:30-15:00
15:00-15:30 Coffee Break
15:30-17:30

Main Program

For the best experience of browsing the MICRO program on your mobile phone, please use the Conference MobileApp.

October 16, 2016 (Sunday)
18:00-20:00	Reception (finger foods, soft drinks, Taiwan beers, and alcoholic drinks provided; registration desk open 17:30-20:00 at B2)
October 17, 2016 (Monday)
7:00-8:00	Breakfast. Registration desk opens at B2 (until 17:30)
8:00-8:20	Opening remarks
	Hall I
8:20-9:20	Keynote I: Internet of Things: History and Hype, Technology and Policy Margaret Martonosi (Princeton) Chair: Mikko Lipasti Abstract: The idea of an emerging Internet of Things (IoT) is currently captivating both technologists and society at large. Although IoT techniques have their roots in ideas that are decades old, their increasingly widespread deployments have made them a hot topic these days, frequently discussed and hyped. As many as 50B networked devices are envisioned by 2020, and proponents of IoT see a world where embedded sensing and control techniques help vehicle traffic flow more smoothly, where environmental sensing and data analysis facilitate better use of natural resources like water, and where personalized health monitoring helps individuals improve their quality of life. On the other hand, properly addressing policy concerns around security and privacy may play a role in IoT's adoption and success. My talk will discuss key technology and policy challenges for future IoT applications and devices. Overall, I will be drawing from both technical experiences and trends, as well as from policy perspectives gained during a one-year fellowship doing technology policy within the U.S. Department of State. Bio: Margaret Martonosi is the Hugh Trumbull Adams '35 Professor of Computer Science at Princeton University, where she has been on the faculty since 1994. From August 2015-2016, she served as a Jefferson Science Fellow doing international aspects of technology policy within the U.S. Department of State. Martonosi's technical research focuses on computer architecture and mobile computing, particularly power-efficient systems. Past projects include the Wattch power modeling tool used by thousands of engineers worldwide, and the ZebraNet mobile sensor network, which was deployed for wildlife tracking in Kenya.
Martonosi holds affiliated appointments in Princeton's Electrical Engineering Department, its Center for Information Technology Policy, its Environmental Institute, and its Andlinger Center for Energy and the Environment. From 2005-2007, she served as Associate Dean for Academic Affairs for the School of Engineering and Applied Science. From 2016-2022, she holds (in addition to her primary position at Princeton) a visiting position as Andrew Dickson White Visiting Professor-At-Large at Cornell University. Martonosi is a Fellow of both IEEE and ACM. Her major awards include Princeton University's 2010 Graduate Mentoring Award, the Anita Borg Institute's 2013 Technical Leadership Award, NCWIT's 2013 Undergraduate Research Mentoring Award, the 2015 Marie Pistilli Women in EDA Achievement Award, and ISCA's 2015 Long-Term Influential Paper Award. Martonosi is an inventor on seven granted US patents, and has co-authored two technical reference books on power-aware computer architecture. She serves on the Board of Directors of the Computing Research Association (CRA).
9:20-10:00	Lightning Session I Chair: Moin Qureshi
10:00-10:20	Break
	Hall III	Hall I
10:20-12:00	Session 1a: Microarchitecture Chair: Minsoo Rhu	Session 1b: Cloud & Storage Chair: Babak Falsafi
10:20-12:00	Dictionary Sharing: An Efficient Cache Compression Scheme for Compressed Caches, Biswabandan Panda (INRIA), André Seznec (INRIA) Perceptron Learning for Reuse Prediction, Elvira Teran (Texas A&M University), Zhe Wang (Intel), Daniel A. Jiménez (Texas A&M University) pTask: A Smart Prefetching Scheme for OS Intensive Applications, Prathmesh Kallurkar (Indian Institute of Technology, New Delhi), Smruti R. Sarangi (Indian Institute of Technology, New Delhi) Register Sharing for Equality Prediction, Arthur Perais (INRIA/IRISA), Fernando A. Endo (INRIA/IRISA), André Seznec (INRIA/IRISA) Data-Centric Execution of Speculative Parallel Programs, Mark C. Jeffrey (MIT), Suvinay Subramanian (MIT), Maleen Abeydeera (MIT), Joel Emer (NVIDIA/MIT), Daniel Sanchez (MIT)	SABRes: Atomic Object Reads for In-Memory Rack-Scale Computing, Alexandros Daglis (EPFL), Dmitrii Ustiugov (EPFL), Stanko Novaković (EPFL), Edouard Bugnion (EPFL), Babak Falsafi (EPFL), Boris Grot (University of Edinburgh) A Cloud-Scale Acceleration Architecture, Adrian M. Caulfield (Microsoft), Eric S. Chung (Microsoft), Andrew Putnam (Microsoft), Hari Angepat (Microsoft), Jeremy Fowers (Microsoft), Michael Haselman (Microsoft), Stephen Heil (Microsoft), Matt Humphrey (Microsoft), Puneet Kaur (Microsoft), Joo-Young Kim (Microsoft), Daniel Lo (Microsoft), Todd Massengill (Microsoft), Kalin Ovtcharov (Microsoft), Michael Papamichael (Microsoft), Lisa Woods (Microsoft), Sitaram Lanka (Microsoft), Derek Chiou (Microsoft), Doug Burger (Microsoft) Towards Efficient Server Architecture for Virtualized Network Function Deployment: Implications and Implementations, Yang Hu (University of Florida), Tao Li (University of Florida) Bridging the I/O Performance Gap for Big Data Workloads: A New NVDIMM-based Approach, Renhai Chen (The Hong Kong Polytechnic University), Zili Shao (The Hong Kong Polytechnic University), Tao Li (University of Florida) NeSC: Self-Virtualizing Nested Storage Controller, Yonatan Gottesman (Technion-Israel Institute of Technology), Yoav Etsion (Technion-Israel Institute of Technology)
12:00-14:00	Lunch (3F Yangtse River & Dragon Hall)
	Hall I
14:00-15:40	Poster session
15:40-16:00	Break
	Hall III	Hall I
16:00-18:00	Session 2a: GPU Chair: Hyeran Jeon	Session 2b: Neural Networks Chair: Emre Ozer
16:00-18:00	MIMD Synchronization on SIMT Architectures, Ahmed ElTantawy (University of British Columbia), Tor M. Aamodt (University of British Columbia) Efficient Kernel Synthesis for Performance Portable Programming, Li-Wen Chang (University of Illinois at Urbana-Champaign), Izzat El Hajj (University of Illinois at Urbana-Champaign), Christopher Rodrigues (Huawei), Juan Gómez-Luna (University of Córdoba), Wen-mei Hwu (University of Illinois at Urbana-Champaign) KLAP: Kernel Launch Aggregation and Promotion for Optimizing Dynamic Parallelism, Izzat El Hajj (University of Illinois at Urbana-Champaign), Juan Gómez-Luna (University of Córdoba), Cheng Li (University of Illinois at Urbana-Champaign), Li-Wen Chang (University of Illinois at Urbana-Champaign), Dejan Milojicic (Hewlett-Packard), Wen-mei Hwu (University of Illinois at Urbana-Champaign) Cache-Emulated Register File: An Integrated On-Chip Memory Architecture for High Performance GPGPUs, Naifeng Jing (Shanghai Jiao Tong University), Jianfei Wang (Shanghai Jiao Tong University), Fengfeng Fan (Shanghai Jiao Tong University), Wenkang Yu (Shanghai Jiao Tong University), Li Jiang (Shanghai Jiao Tong University), Chao Li (Shanghai Jiao Tong University), Xiaoyao Liang (Shanghai Jiao Tong University) Zorua: A Holistic Approach to Resource Virtualization in GPUs, Nandita Vijaykumar (Carnegie Mellon University), Kevin Hsieh (Carnegie Mellon University), Gennady Pekhimenko (Microsoft and Carnegie Mellon University), Samira Khan (University of Virginia), Ashish Shrestha (Carnegie Mellon University), Saugata Ghose (Carnegie Mellon University), Adwait Jog (College of William and Mary), Phillip B. Gibbons (Carnegie Mellon University), Onur Mutlu (ETH Zürich and Carnegie Mellon University) GRAPE: Minimizing Energy for GPU Applications with Performance Requirements, Muhammad Husni Santriaji (Surya University & University of Chicago), Henry Hoffmann (University of Chicago)	From High-Level Deep Neural Models to FPGAs, Hardik Sharma (Georgia Institute of Technology), Jongse Park (Georgia Institute of Technology), Divya Mahajan (Georgia Institute of Technology), Emmanuel Amaro (Georgia Institute of Technology), Joon Kyung Kim (Georgia Institute of Technology), Chenkai Shao (Georgia Institute of Technology), Asit Mishra (Intel), Hadi Esmaeilzadeh (Georgia Institute of Technology) vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design, Minsoo Rhu (NVIDIA), Natalia Gimelshein (NVIDIA), Jason Clemons (NVIDIA), Arslan Zulfiqar (NVIDIA), Stephen W. Keckler (NVIDIA) Stripes: Bit-Serial Deep Neural Network Computing, Patrick Judd (University of Toronto), Jorge Albericio (University of Toronto), Tayler Hetherington (University of British Columbia), Tor M. Aamodt (University of British Columbia), Andreas Moshovos (University of Toronto) Cambricon-X: An Accelerator for Sparse Neural Networks, Shijin Zhang (Chinese Academy of Sciences), Zidong Du (Chinese Academy of Sciences), Lei Zhang (Chinese Academy of Sciences), Huiying Lan (Chinese Academy of Sciences), Shaoli Liu (Chinese Academy of Sciences), Ling Li (Chinese Academy of Sciences), Qi Guo (Chinese Academy of Sciences), Tianshi Chen (Chinese Academy of Sciences), Yunji Chen (Chinese Academy of Sciences) NEUTRAMS: Neural Network Transformation and Co-design under Neuromorphic Hardware Constraints, Yu Ji (Tsinghua University), YouHui Zhang (Tsinghua University), ShuangChen Li (University of California, Santa Barbara), Ping Chi (University of California, Santa Barbara), CiHang Jiang (Tsinghua University), Peng Qu (Tsinghua University), Yuan Xie (University of California, Santa Barbara), WenGuang Chen (Tsinghua University) Fused-Layer CNN Accelerators, Manoj Alwani (Stony Brook University), Han Chen (Stony Brook University), Michael Ferdman (Stony Brook University), Peter Milder (Stony Brook University)
18:20-20:00	Business meeting
October 18, 2016 (Tuesday)
7:30-8:30	Breakfast. Registration desk opens at B2 (until 17:30)
	Hall I
8:30-9:30	Keynote II: Low Power CPU: From Mobile to Wearable & IoT Uming Ko (MediaTek) Chair: Hsien-Hsin Lee Abstract: With the landmark introduction of the smartphone in 2007, mobile Internet and computing took off, and the associated data bandwidth has since grown exponentially, resulting in ever-increasing computing requirements. However, mobile CPUs will soon hit frequency and thermal limits. Thus, mobile clients are rapidly moving to multi-core CPUs/GPUs with system-adaptive power management, thermal throttling, and heterogeneous multi-processing. The insatiable need for computation, coupled with the explosion of the Internet of Things (IoT), which demands long battery operation, further presents major thermal and energy gaps. Consequently, many innovations are urgently needed to enable the ubiquitous ecosystem that promises ample possibilities to enhance and enrich everyone's life. Bio: Uming Ko is the Vice President of Technology and General Manager of the High-performance Processors Technology at MediaTek Inc., Taiwan. He received his B.S. degree in Electrical Engineering from National Tsing-Hua University, Taiwan and his M.S. and Ph.D. degrees from the University of Texas. He first worked at AMD and then joined Texas Instruments (TI) in 1986. He was elected TI Fellow in 2000 for his innovation on the industry's most energy-efficient DSP processor, the TMS320C55x DSP. In 2005, Dr. Ko was recognized as a TI Senior Fellow for his technical leadership in ultra-low-power design and innovations of SmartReflex Power and Performance Management technology. At MediaTek, he oversees and leads the development of high-performance processors. Dr. Ko has published 44 technical papers and holds 48 US patents. He received 12 national-level and industrial awards including the 2006 Asian American Engineer of the Year Award in the US. He is a Fellow of the IEEE.
9:30-10:10	Lightning Session II Chair: Yuan Xie
10:10-10:30	Break
	Hall III	Hall I
10:30-12:10	Session 3a: Compilation & Memory Chair: Samira Khan	Session 3b: Interconnect Chair: Sreenivas Subramoney
10:30-12:10	Continuous Shape Shifting: Enabling Loop Co-optimization via Near-Free Dynamic Code Rewriting, Animesh Jain (University of Michigan, Ann Arbor), Michael A. Laurenzano (University of Michigan, Ann Arbor), Lingjia Tang (University of Michigan, Ann Arbor), Jason Mars (University of Michigan, Ann Arbor) CrystalBall: Statically Analyzing Runtime Behavior via Deep Sequence Learning, Stephen Zekany (University of Michigan, Ann Arbor), Daniel Rings (University of Michigan, Ann Arbor), Nathan Harada (University of Michigan, Ann Arbor), Michael A. Laurenzano (University of Michigan, Ann Arbor; Clinc), Lingjia Tang (University of Michigan, Ann Arbor; Clinc), Jason Mars (University of Michigan, Ann Arbor; Clinc) Low-Cost Soft Error Resilience with Unified Data Verification and Fine-Grained Recovery for Acoustic Sensor Based Detection, Qingrui Liu (Virginia Tech), Changhee Jung (Virginia Tech, Blacksburg), Dongyoon Lee (Virginia Tech, Blacksburg), Devesh Tiwari (Oak Ridge National Lab) Lazy Release Consistency for GPUs, Johnathan Alsop (University of Illinois at Urbana-Champaign), Marc S. Orr (University of Wisconsin - Madison and AMD), Bradford M. Beckmann (AMD), David A. Wood (University of Wisconsin - Madison and AMD) Improving Energy Efficiency of DRAM by Exploiting Half Page Row Access, Heonjae Ha (Stanford University), Ardavan Pedram (Stanford University and Movidius), Stephen Richardson (Stanford University), Shahar Kvatinsky (Technion-Israel Institute of Technology), Mark Horowitz (Stanford University)	OSCAR: Orchestrating STT-RAM Cache Traffic for Heterogeneous CPU-GPU Architectures, Jia Zhan (University of California, Santa Barbara), Onur Kayiran (Advanced Micro Devices), Gabriel H. Loh (Advanced Micro Devices), Chita R. Das (The Pennsylvania State University), Yuan Xie (University of California, Santa Barbara) A Unified Memory Network Architecture for In-Memory Computing in Commodity Servers, Jia Zhan (University of California, Santa Barbara), Itir Akgun (University of California, Santa Barbara), Jishen Zhao (University of California, Santa Cruz), Al Davis (HP), Paolo Faraboschi (HP), Yuangang Wang (Huawei), Yuan Xie (University of California, Santa Barbara) Contention-based Congestion Management in Large-Scale Networks, Gwangsun Kim (KAIST), Changhyun Kim (KAIST), Jiyun Jeong (KAIST), Mike Parker (Intel), John Kim (KAIST) Dynamic Error Mitigation in NoCs using Intelligent Prediction Techniques, Dominic DiTomaso (Ohio University), Travis Boraten (Ohio University), Avinash Kodi (Ohio University), Ahmed Louri (George Washington University) Reducing Data Movement Energy via Online Data Clustering and Encoding, Shibo Wang (University of Rochester), Engin Ipek (University of Rochester)
12:10-14:10	Award Lunch (including Bob Rau Award, Test of Time) (B1 Formosa)
	Hall III	Hall I
14:10-15:30	Session 4a: Multicore Chair: Carole-Jean Wu	Session 4b: Security Chair: Koji Inoue
14:10-15:30	Racer: TSO Consistency via Race Detection, Alberto Ros (Universidad de Murcia), Stefanos Kaxiras (Uppsala Universitet) Exploiting Semantic Commutativity in Hardware Speculation, Guowei Zhang (MIT), Virginia Chiu (MIT), Daniel Sanchez (MIT) CANDY: Enabling Coherent DRAM Caches for Multi-Node Systems, Chiachen Chou (Georgia Institute of Technology), Aamer Jaleel (NVIDIA), Moinuddin K. Qureshi (Georgia Institute of Technology) C3D: Mitigating the NUMA Bottleneck via Coherent DRAM Caches, Cheng-Chieh Huang (University of Edinburgh), Rakesh Kumar (University of Edinburgh), Marco Elver (University of Edinburgh), Boris Grot (University of Edinburgh), Vijay Nagarajan (University of Edinburgh)	Quantifying and Improving the Efficiency of Hardware-based Mobile Malware Detectors, Mikhail Kazdagli (UT Austin), Vijay Janapa Reddi (UT Austin), Mohit Tiwari (UT Austin) PoisonIvy: Safe Speculation for Secure Memory, Tamara Silbergleit Lehman (Duke University), Andrew D. Hilton (Duke University), Benjamin C. Lee (Duke University) ReplayConfusion: Detecting Cache-based Covert Channel Attacks Using Record and Replay, Mengjia Yan (University of Illinois at Urbana Champaign), Yasser Shalabi (University of Illinois at Urbana Champaign), Josep Torrellas (University of Illinois at Urbana Champaign) Jump Over ASLR: Attacking Branch Predictors to Bypass ASLR, Dmitry Evtyushkin (Binghamton University), Dmitry Ponomarev (Binghamton University), Nael Abu-Ghazaleh (University of California, Riverside)
15:30-16:00	Break
	Hall III	Hall I
16:00-17:00	Session 5a: Approximate Computing Chair: Andreas Moshovos	Session 5b: Accelerators 1 Chair: Tao Li
16:00-17:00	Concise Loads and Stores: The Case for an Asymmetric Compute-Memory Architecture for Approximation, Animesh Jain (University of Michigan), Parker Hill (University of Michigan), Shih-Chieh Lin (University of Michigan), Muneeb Khan (Uppsala University), Md E. Haque (University of Michigan), Michael A. Laurenzano (University of Michigan), Scott Mahlke (University of Michigan), Lingjia Tang (University of Michigan), Jason Mars (University of Michigan) Approxilyzer: Towards A Systematic Framework for Instruction-Level Approximate Computing and its Application to Hardware Resiliency, Radha Venkatagiri (University of Illinois at Urbana Champaign), Abdulrahman Mahmoud (University of Illinois at Urbana Champaign), Siva Kumar Sastry Hari (NVIDIA), Sarita V. Adve (University of Illinois at Urbana Champaign) The Bunker Cache for Spatio-Value Approximation, Joshua San Miguel (University of Toronto), Jorge Albericio (University of Toronto), Natalie Enright Jerger (University of Toronto), Aamer Jaleel (NVIDIA)	HARE: Hardware Accelerator for Regular Expressions, Vaibhav Gogte (University of Michigan), Aasheesh Kolli (University of Michigan), Michael J. Cafarella (University of Michigan), Loris D'Antoni (University of Wisconsin-Madison), Thomas F. Wenisch (University of Michigan) The Microarchitecture of a Real-time Robot Motion Planning Accelerator, Sean Murray (Duke University), Will Floyd-Jones (Duke University), Ying Qi (Duke University), George Konidaris (Duke University), Daniel J. Sorin (Duke University) Efficient Data Supply for Hardware Accelerators with Prefetching and Access/Execute Decoupling, Tao Chen (Cornell University), G. Edward Suh (Cornell University)
18:00-21:00	Banquet. Buses bound for the banquet venue will wait in front of Howard Hotel between 17:10 and 17:30. Return buses will depart from the banquet venue at 21:00 and arrive at Howard Hotel at around 21:30.
October 19, 2016 (Wednesday)
7:00-8:00	Breakfast. Registration desk opens at B2 (until 12:00)
	Hall III	Hall I
8:00-9:40	Session 6a: Accelerators 2 Chair: Ren-Shuo Liu	Session 6b: Mobile & Power Mgmt Chair: Jaewoong Sim
8:00-9:40	An Ultra Low-Power Hardware Accelerator for Automatic Speech Recognition, Reza Yazdani (Universitat Politecnica de Catalunya), Albert Segura (Universitat Politecnica de Catalunya), Jose-Maria Arnau (Universitat Politecnica de Catalunya), Antonio Gonzalez (Universitat Politecnica de Catalunya) Co-Designing Accelerators and SoC Interfaces using gem5-Aladdin, Yakun Sophia Shao (NVIDIA), Sam (Likun) Xi (Harvard University), Vijayalakshmi Srinivasan (IBM), Gu-Yeon Wei (Harvard University), David Brooks (Harvard University) CHAINSAW: Von-Neumann Accelerators to Leverage Fused Instruction Chains, Amirali Sharifan (Simon Fraser University), Snehasish Kumar (Simon Fraser University), Apala Guha (Simon Fraser University), Arrvindh Shriraman (Simon Fraser University) Chameleon: Versatile and Practical Near-DRAM Acceleration Architecture for Large Memory Systems, Hadi Asghari-Moghaddam (University of Illinois at Urbana-Champaign), Young Hoon Son (Seoul National University), Jung Ho Ahn (Seoul National University), Nam Sung Kim (University of Illinois at Urbana-Champaign)	A Patch Memory System For Image Processing and Computer Vision, Jason Clemons (NVIDIA), Chih-Chi Cheng (Qualcomm), Iuri Frosio (NVIDIA), Daniel Johnson (NVIDIA), Steve W. Keckler (NVIDIA) Evaluating Programmable Architectures for Imaging and Vision Applications, Artem Vasilyev (Stanford University), Nikhil Bhagdikar (Stanford University), Ardavan Pedram (Stanford University and Movidius), Stephen Richardson (Stanford University), Shahar Kvatinsky (Technion), Mark Horowitz (Stanford University) Redefining QoS and Customizing the Power Management Policy to Satisfy Individual Mobile Users, Kaige Yan (University of Houston), Xingyao Zhang (University of Houston), Jingweijia Tan (University of Houston), Xin Fu (University of Houston) Snatch: Opportunistically Reassigning Power Allocation between Processor and Memory in 3D Stacks, Dimitrios Skarlatos (University of Illinois at Urbana-Champaign), Renji Thomas (Ohio State University), Aditya Agrawal (NVIDIA), Shibin Qin (University of Illinois at Urbana-Champaign), Robert Pilawa-Podgurski (University of Illinois at Urbana-Champaign), Ulya R. Karpuzcu (University of Minnesota, Twin Cities), Radu Teodorescu (Ohio State University), Nam Sung Kim (University of Illinois at Urbana-Champaign), Josep Torrellas (University of Illinois at Urbana-Champaign) Ti-states: Processor Power Management in the Temperature Inversion Region, Yazhou Zu (University of Texas at Austin), Wei Huang (AMD), Indrani Paul (AMD), Vijay Janapa Reddi (University of Texas at Austin)
9:40-10:00	Break
	Hall I
10:00-12:00	Session 7: Best Paper Candidates Chairs: Mikko Lipasti, Hsien-Hsin Lee
10:00-12:00	Graphicionado: A High-Performance and Energy-Efficient Accelerator for Graph Analytics, Tae Jun Ham (Princeton University), Lisa Wu (University of California, Berkeley), Narayanan Sundaram (Intel), Nadathur Satish (Intel), Margaret Martonosi (Princeton University) Improving Bank-Level Parallelism for Irregular Applications, Xulong Tang (Pennsylvania State University, University Park), Mahmut Kandemir (Pennsylvania State University, University Park), Praveen Yedlapalli (VMware), Jagadish Kotra (Pennsylvania State University, University Park) Delegated Persist Ordering, Aasheesh Kolli (University of Michigan), Jeff Rosen (Snowflake Computing), Stephan Diestelhorst (ARM), Ali Saidi (ARM), Steven Pelley (Snowflake Computing), Sihang Liu (University of Michigan), Peter M. Chen (University of Michigan), Thomas F. Wenisch (University of Michigan) Spectral Profiling: Observer-Effect-Free Profiling by Monitoring EM Emanations, Nader Sehatbakhsh (Georgia Institute of Technology, Atlanta), Alireza Nazari (Georgia Institute of Technology, Atlanta), Alenka Zajic (Georgia Institute of Technology, Atlanta), Milos Prvulovic (Georgia Institute of Technology, Atlanta) Path Confidence based Lookahead Prefetching, Jinchun Kim (Texas A&M University), Seth H. Pugsley (Intel), Paul V. Gratz (Texas A&M University), A. L. Narasimha Reddy (Texas A&M University), Chris Wilkerson (Intel), Zeshan Chishti (Intel) Continuous Runahead: Transparent Hardware Acceleration for Memory Intensive Workloads, Milad Hashemi (The University of Texas at Austin), Onur Mutlu (ETH Zürich), Yale N. Patt (The University of Texas at Austin)
12:00-12:30	Session 8: Conference Closing and Best Paper Award
12:30-18:15	Conference Excursion (includes sack lunch). Buses bound for the excursion site will wait in front of Howard Hotel at 12:30. Return buses will depart from the excursion site at 17:45 and arrive at Howard Hotel at around 18:15.

Lightning session

10/17 (Monday) 9:20am-10:00am
Session chair: Moin Qureshi
1. Dictionary Sharing: An Efficient Cache Compression Scheme for Compressed Caches
2. Perceptron Learning for Reuse Prediction
3. pTask: A Smart Prefetching Scheme for OS Intensive Applications
4. Register Sharing for Equality Prediction
5. Data-Centric Execution of Speculative Parallel Programs
6. SABRes: Fast Atomic Remote Object Reads for Rack-Scale In-Memory Computing
7. A Cloud-Scale Acceleration Architecture
8. Towards Efficient Server Architecture for Virtualized Network Function Deployment: Implications and Implementations
9. Bridging the I/O Performance Gap for Big Data Workloads: A New NVDIMM-based Approach
10. NeSC: Self Virtualizing Nested Storage Controller
11. MIMD Synchronization on SIMT Architectures
12. Efficient Kernel Synthesis for Performance Portable Programming
13. KLAP: Kernel Launch Aggregation and Promotion for Optimizing Dynamic Parallelism
14. Cache-Emulated Register File: An Integrated On-Chip Memory Architecture for High Performance GPGPUs
15. Zorua: A Holistic Approach to Resource Virtualization in GPUs
16. GRAPE: Minimizing Energy for GPU Applications with Performance Requirements
17. From High-Level Deep Neural Models to FPGAs
18. vDNN: Virtualized Deep Neural Networks for Scalable Memory-Efficient Neural Network Design
19. Stripes: Bit-Serial Deep Neural Network Computing
20. Cambricon-X: An Accelerator for Sparse Neural Networks
21. NEUTRAMS: Neural Network Transformation and Co-design under Neuromorphic Hardware Constraints
22. Fused-Layer CNN Accelerators
23. Continuous Shape Shifting: Enabling Loop Co-optimization via Near-Free Dynamic Code Rewriting
24. CrystalBall: Statically Analyzing Runtime Behavior via Deep Sequence Learning
25. Low-Cost Soft Error Resilience with Unified Data Verification and Fine-Grained Recovery for Acoustic Sensor Based Detection
26. Lazy Release Consistency for GPUs
27. Improving Energy Efficiency of DRAM by Exploiting Half Page Row Access
28. OSCAR: Orchestrating STT-RAM Cache Traffic in Heterogeneous Architectures
29. A Unified Memory Network Architecture for In-Memory Computing in Commodity Servers
30. Contention-based Congestion Management in Large-Scale Networks

10/18 (Tuesday) 9:30am-10:10am
Session chair: Yuan Xie
1. Dynamic Error Mitigation in NoCs using Intelligent Prediction Techniques
2. Reducing Data Movement Energy via Online Data Clustering and Encoding
3. Racer: TSO Consistency via Race Detection
4. Exploiting Semantic Commutativity in Hardware Speculation
5. CANDY: Enabling Coherent DRAM Caches for Multi-Node Systems
6. C3D: Mitigating NUMA Effects via Coherent DRAM Caches
7. Quantifying and Improving the Efficiency of Hardware-based Mobile Malware Detectors
8. PoisonIvy: Safe Speculation for Secure Memory
9. RePlayConfusion: Detecting LLC-based Covert Timing Channel Attacks Using Record and Replay
10. Jump Over ASLR: Attacking Branch Predictors to Bypass ASLR
11. Concise Loads and Stores: The Case for an Asymmetric Compute-Memory Architecture for Approximation
12. Approxilyzer: Towards A Systematic Framework for Instruction-Level Approximate Computing and its Application to Hardware Resiliency
13. The Bunker Cache for Spatio-Value Approximation
14. HARE: Hardware acceleration for regular expressions
15. The Microarchitecture of a Real-time Robot Motion Planning Accelerator
16. Efficient Data Supply for Hardware Accelerators with Prefetching and Access/Execute Decoupling
17. An Ultra Low-Power Hardware Accelerator for Automatic Speech Recognition
18. Co-Designing Accelerators and SoC Interfaces using gem5-Aladdin
19. CHAINSAW: Creating Von-Neumann Accelerators with Fused Instruction Chains
20. Chameleon: Versatile and Practical Near-DRAM Acceleration Architecture for Large Memory Systems
21. A Patch Memory System For Image Processing and Computer Vision
22. Evaluating Programmable Architectures for ISP and Computer Vision
23. Redefining QoS and Customizing the Power Management Policy to Satisfy Individual Mobile Users
24. Snatch: Opportunistically Reassigning Power Allocation between Processor and Memory in 3D Stacks
25. Ti-states: Processor Power Management in the Temperature Inversion Region
26. Graphicionado: A High-Performance and Energy-Efficient Accelerator for Graph Analytics
27. Improving Bank-Level Parallelism for Irregular Applications
28. Delegated Persist Ordering
29. Spectral Profiling: Observer-Effect-Free Profiling by Monitoring EM Emanations
30. Path Confidence based Lookahead Prefetching
31. Continuous Runahead: Transparent Hardware Acceleration for Memory Intensive Workloads

Awards

Bob Rau Memorial Award: Guri Sohi (University of Wisconsin-Madison)
Micro Test of Time Award: B. Ramakrishna Rau
Micro Hall of Fame Awards:
- David Brooks (Harvard University)
- Boris Grot (University of Edinburgh)
- Moin Qureshi (Georgia Institute of Technology, Atlanta)
- David Wood (University of Wisconsin-Madison)
- Andre Seznec (IRISA/INRIA)
- Nam Sung Kim (University of Illinois, Urbana-Champaign)
ACM Distinguished Service Award: Yale Patt (The University of Texas at Austin)
Best Paper Award:
- Graphicionado: A High-Performance and Energy-Efficient Accelerator for Graph Analytics Tae Jun Ham (Princeton University), Lisa Wu (University of California, Berkeley), Narayanan Sundaram (Intel), Nadathur Satish (Intel), Margaret Martonosi (Princeton University)
- Spectral Profiling: Observer-Effect-Free Profiling by Monitoring EM Emanations Nader Sehatbakhsh (Georgia Institute of Technology, Atlanta), Alireza Nazari (Georgia Institute of Technology, Atlanta), Alenka Zajic (Georgia Institute of Technology, Atlanta), Milos Prvulovic (Georgia Institute of Technology, Atlanta)

