Annual IEEE/ACM International Symposium on Microarchitecture®

MICRO Test of Time Award

List of Eligible Papers for the 2021 Award

MICRO 1999

Paper Title | Authors
Control Independence in Trace Processors | Eric Rotenberg, James E. Smith
Improving Branch Predictors by Correlating on Data Values | Timothy H. Heil, Zak Smith, James E. Smith
Instruction Fetch Mechanisms for Multipath Execution Processors | Artur Klauser, Dirk Grunwald
A Superscalar 3D Graphics Engine | Andrew Wolfe, Derek B. Noonburg
Dynamic 3D Graphics Workload Characterization and the Architectural Implications | Tulika Mitra, Tzi-cker Chiueh
Exploiting a New Level of DLP in Multimedia Applications | Jesus Corbal, Roger Espasa, Mateo Valero
Compiler-Driven Cached Code Compression Schemes for Embedded ILP Processors | Sergei Y. Larin, Thomas M. Conte
Evaluation of a High Performance Code Compression Method | Charles Lefurgy, Eva Piccininni, Trevor N. Mudge
Low-Cost Branch Folding for Embedded Applications with Small Tight Loops | Lea Hwang Lee, Jeff Scott, Bill Moyer, John Arends
Automatic and Efficient Evaluation of Memory Hierarchies for Embedded Systems | Santosh G. Abraham, Scott A. Mahlke
Hardware Identification of Cache Conflict Misses | Jamison D. Collins, Dean M. Tullsen
Access Region Locality for High-Bandwidth Processor Memory System Design | Sangyeun Cho, Pen-Chung Yew, Gyungho Lee
Code Transformations to Improve Memory Parallelism | Vijay S. Pai, Sarita V. Adve
Compiler-Directed Dynamic Computation Reuse: Rationale and Initial Results | Daniel A. Connors, Wen-mei W. Hwu
Dynamic Memory Disambiguation in the Presence of Out-of-Order Store Issuing | Soner Onder, Rajiv Gupta
Read-After-Read Memory Dependence Prediction | Andreas Moshovos, Gurindar S. Sohi
Delaying Physical Register Allocation through Virtual-Physical Registers | Teresa Monreal, Antonio González, Mateo Valero, José González, Victor Viñals
Exploiting ILP in Page-based Intelligent Memory | Mark Oskin, Justin Hensley, Diana Keen, Frederic T. Chong, Matthew K. Farrens, Aneet Chopra
The Use of Multithreading for Exception Handling | Craig B. Zilles, Joel S. Emer, Gurindar S. Sohi
Value Prediction for Speculative Multithreaded Architectures | Pedro Marcuello, Jordi Tubella, Antonio González
Predicting the Usefulness of a Block Result: A Micro-Architectural Technique for High-Performance Low-Power Processors | Enric Musoll
Wavefront Scheduling: Path Based Data Representation and Scheduling of Subgraphs | Jay Bharadwaj, Kishore N. Menezes, Chris McKinsey
Balance Scheduling: Weighting Branch Tradeoffs in Superblocks | Alexandre E. Eichenberger, Waleed Meleis
Optimizations and Oracle Parallelism with Dynamic Translation | Kemal Ebcioglu, Erik R. Altman, Sumedh W. Sathaye, Michael Gschwind

MICRO 2000

Paper Title | Authors
Eager Writeback - A Technique for Improving Bandwidth Utilization | Hsien-Hsin S. Lee, Gary S. Tyson, Matthew K. Farrens
Silent Stores for Free | Kevin M. Lepak, Mikko H. Lipasti
Predictor-Directed Stream Buffers | Timothy Sherwood, Suleyman Sair, Brad Calder
On Pipelining Dynamic Instruction Scheduling Logic | Jared Stark, Mary D. Brown, Yale N. Patt
The Impact of Delay on the Design of Branch Predictors | Daniel A. Jiménez, Stephen W. Keckler, Calvin Lin
Improving BTB Performance in the Presence of DLLs | Stevan A. Vlaovic, Edward S. Davidson, Gary S. Tyson
Efficient Checker Processor Design | Saugata Chatterjee, Christopher T. Weaver, Todd M. Austin
An Integrated Approach to Accelerate Data and Predicate Computations in Hyperblocks | Alexandre E. Eichenberger, Waleed Meleis, Suman Maradani
Accurate and Efficient Predicate Analysis with Binary Decision Diagrams | John W. Sias, Wen-mei W. Hwu, David I. August
Modulo Scheduling for a Fully-Distributed Clustered VLIW Architecture | F. Jesús Sánchez, Antonio González
Two-Level Hierarchical Register File Organization for VLIW Processors | Javier Zalamea, Josep Llosa, Eduard Ayguadé, Mateo Valero
PipeRench Implementation of the Instruction Path Coprocessor | Yuan C. Chou, Pazhani Pillai, Herman Schmit, John Paul Shen
Efficient Conditional Operations for Data-Parallel Architectures | Ujval J. Kapasi, William J. Dally, Scott Rixner, Peter R. Mattson, John D. Owens, Brucek Khailany
Flexible Hardware Acceleration for Multimedia Oriented Microprocessors | Frederik Vermeulen, Lode Nachtergaele, Francky Catthoor, Diederik Verkest, Hugo De Man
Very Low Power Pipelines Using Significance Compression | Ramon Canal, Antonio González, James E. Smith
A Static Power Model for Architects | J. Adam Butts, Gurindar S. Sohi
A Framework for Dynamic Energy Efficiency and Temperature Management | Michael C. Huang, Jose Renau, Seung-Moon Yoo, Josep Torrellas
Dynamic Zero Compression for Cache Energy Reduction | Luis Villa, Michael Zhang, Krste Asanovic
Register Integration: A Simple and Efficient Implementation of Squash Reuse | Amir Roth, Gurindar S. Sohi
The Store-Load Address Table and Speculative Register Promotion | Matt Postiff, David A. Greene, Trevor N. Mudge
Memory Hierarchy Reconfiguration for Energy and Performance in General-Purpose Processor Architectures | Rajeev Balasubramonian, David H. Albonesi, Alper Buyuktosunoglu, Sandhya Dwarkadas
Frequent Value Compression in Data Caches | Jun Yang, Youtao Zhang, Rajiv Gupta
A Study of Slipstream Processors | Zachary Purser, Karthik Sundaramoorthy, Eric Rotenberg
Relational Profiling: Enabling Thread-Level Parallelism in Virtual Machines | Timothy H. Heil, James E. Smith
Calpa: A Tool for Automating Selective Dynamic Compilation | Markus Mock, Craig Chambers, Susan J. Eggers
Increasing the Size of Atomic Instruction Blocks Using Control Flow Assertions | Sanjay J. Patel, Tony Tung, Satarupa Bose, Matthew M. Crum
Reducing Wire Delay Penalty Through Value Prediction | Joan-Manuel Parcerisa, Antonio González
Compiler Controlled Value Prediction Using Branch Predictor Based Confidence | Eric Larson, Todd M. Austin
Instruction Distribution Heuristics for Quad-Cluster, Dynamically-Scheduled, Superscalar Processors | Amirali Baniasadi, Andreas Moshovos
Performance Improvement with Circuit-Level Speculation | Tong Liu, Shih-Lien Lu

MICRO 2001

Paper Title | Authors
Skipper: A Microarchitecture for Exploiting Control-Flow Independence | Chen-Yong Cher, T. N. Vijaykumar
Performance Characterization of a Hardware Mechanism for Dynamic Optimization | Brian Fahs, Satarupa Bose, Matthew Crum, Brian Slechta, Francesco Spadini, Tony Tung, Sanjay J. Patel, Steven S. Lumetta
Using Variable-MHz Microprocessors to Efficiently Handle Uncertainty in Real-Time Systems | Eric Rotenberg
A Design Space Evaluation of Grid Processor Architectures | Ramadass Nagarajan, Karthikeyan Sankaralingam, Doug Burger, Stephen W. Keckler
Reducing Set-Associative Cache Energy via Way-Prediction and Selective Direct-Mapping | Michael D. Powell, Amit Agarwal, T. N. Vijaykumar, Babak Falsafi, Kaushik Roy
A Code Decompression Architecture for VLIW Processors | Yuan Xie, Wayne Wolf, Haris Lekatsas
Direct Load: Dependence-Linked Dataflow Resolution of Load Address and Cache Coordinate | Byung-Kwon Chung, Jinsuo Zhang, Jih-Kwon Peir, Shih-Chang Lai, Konrad Lai
Reducing Power Requirements of Instruction Scheduling Through Dynamic Allocation of Multiple Datapath Resources | Dmitry Ponomarev, Gurhan Kucuk, Kanad Ghose
Exploiting VLIW Schedule Slacks for Dynamic and Leakage Energy Reduction | Wensheng Zhang, Vijaykrishnan Narayanan, Mahmut Kandemir, Mary Jane Irwin, David Duarte, Yuh-Fang Tsai
Reducing Power with Dynamic Critical Path Information | John S. Seng, Eric S. Tune, Dean M. Tullsen
Direct Addressed Caches for Reduced Power Consumption | Emmett Witchel, Sam Larsen, C. Scott Ananian, Krste Asanović
Modulo Schedule Buffers | Matthew C. Merten, Wen-mei W. Hwu
Graph-Partitioning Based Instruction Scheduling for Clustered Processors | Alex Aletà, Josep M. Codina, Jesús Sánchez, Antonio González
Modulo Scheduling with Integrated Register Spilling for Clustered VLIW Architectures | Javier Zalamea, Josep Llosa, Eduard Ayguadé, Mateo Valero
Efficient Static Single Assignment Form for Predication | Arthur Stoutchinin, Francois de Ferriere
The Impact of If-Conversion and Branch Prediction on Program Execution on the Intel® Itanium™ Processor | Youngsoo Choi, Allan Knies, Luke Gerke, Tin-Fook Ngai
Mapping Reference Code to Irregular DSPs Within the Retargetable, Optimizing Compiler COGEN(T) | Gary William Gréwal, Charles Thomas Wilson
Select-Free Instruction Scheduling Logic | Mary D. Brown, Jared Stark, Yale N. Patt
Dual Use of Superscalar Datapath for Transient-Fault Detection and Recovery | Joydeep Ray, James C. Hoe, Babak Falsafi
A High-Speed Dynamic Instruction Scheduling Scheme for Superscalar Processors | Masahiro Goshima, Kengo Nishino, Toshiaki Kitamura, Yasuhiko Nakashima, Shinji Tomita, Shin-ichiro Mori
Reducing the Complexity of the Register File in Dynamic Superscalar Processors | Rajeev Balasubramonian, Sandhya Dwarkadas, David H. Albonesi
Saving Energy with Architectural and Frequency Adaptations for Multimedia Applications | Christopher J. Hughes, Jayanth Srinivasan, Sarita V. Adve
Enhancing Loop Buffering of Media and Telecommunications Applications Using Low-Overhead Predication | John W. Sias, Hillery C. Hunter, Wen-mei W. Hwu
Cool-Cache for Hot Multimedia | Osman S. Unsal, Raksit Ashok, Israel Koren, C. Mani Krishna, Csaba Andras Moritz
ZR: A 3D API Transparent Technology for Chunk Rendering | Emile Hsieh, Vladimir Pentkovski, Thomas Piazza
Dynamic Speculative Precomputation | Jamison D. Collins, Dean M. Tullsen, Hong Wang, John P. Shen
Handling Long-Latency Loads in a Simultaneous Multithreading Processor | Dean M. Tullsen, Jeffery A. Brown
Correctly Implementing Value Prediction in Microprocessors That Support Multithreading or Multiprocessing | Milo M. K. Martin, Daniel J. Sorin, Harold W. Cain, Mark D. Hill, Mikko H. Lipasti

MICRO 2002

Paper Title | Authors
Vacuum Packing - Extracting Hardware-Detected Program Phases for Post-Link Optimization | Ronald D. Barnes, Erik M. Nystrom, Matthew C. Merten, Wen-mei W. Hwu
Power Protocol - Reducing Power Dissipation on Off-Chip Data Buses | K. Basu, Alok N. Choudhary, Jayaprakash Pisharath, Mahmut T. Kandemir
Hierarchical Scheduling Windows | Edward Brekelbaum, Jeff Rupley, Chris Wilkerson, Bryan Black
Characterizing and Predicting Value Degree of Use | J. Adam Butts, Gurindar S. Sohi
Microarchitectural Support for Precomputation Microthreads | Robert S. Chappell, Francis Tseng, Adi Yoaz, Yale N. Patt
Pointer Cache Assisted Prefetching | Jamison D. Collins, Suleyman Sair, Brad Calder, Dean M. Tullsen
Three-Dimensional Memory Vectorization for High Bandwidth Media Memory Systems | Jesús Corbal, Roger Espasa, Mateo Valero
DELI - A New Run-Time Control Point | Giuseppe Desoli, Nikolay Mateev, Evelyn Duesterwald, Paolo Faraboschi, Joseph A. Fisher
Managing Static Leakage Energy in Microprocessor Functional Units | Steve Dropsho, Volkan Kursun, David H. Albonesi, Sandhya Dwarkadas, Eby G. Friedman
A Faster Optimal Register Allocator | Changqing Fu, Kent D. Wilken
Effective Instruction Scheduling Techniques for an Interleaved Cache Clustered VLIW Processor | Enric Gibert, F. Jesús Sánchez, Antonio González
Microarchitectural Denial of Service - Insuring Microarchitectural Fairness | Dirk Grunwald, Soraya Ghiasi
Dynamic Addressing Memory Arrays with Physical Locality | Steven Hsu, Shih-Lien Lu, Shih-Chang Lai, Ram Krishnamurthy, Konrad Lai
Generating Physical Addresses Directly for Saving Instruction TLB Energy | Ismail Kadayif, Anand Sivasubramaniam, Mahmut T. Kandemir, Gokul B. Kandiraju, Guangyu Chen
Drowsy Instruction Caches - Leakage Power Reduction Using Dynamic Voltage Scaling and Cache Sub-Bank Prediction | Nam Sung Kim, Krisztián Flautner, David T. Blaauw, Trevor N. Mudge
Vector vs. Superscalar and VLIW Architectures for Embedded Multimedia Benchmarks | Christoforos E. Kozyrakis, David A. Patterson
Compiling for Instruction Cache Performance on a Multithreaded Architecture | Rakesh Kumar, Dean M. Tullsen
Convergent Scheduling | Walter Lee, Diego Puppin, Shane Swenson, Saman P. Amarasinghe
Reduced Code Size Modulo Scheduling in the Absence of Hardware Support | Josep Llosa, Stefan M. Freudenberger
Exploiting Data-Width Locality to Increase Superscalar Execution Bandwidth | Gabriel H. Loh
Cherry - Checkpointed Early Resource Recycling in Out-of-Order Microprocessors | José F. Martínez, Jose Renau, Michael C. Huang, Milos Prvulovic, Josep Torrellas
Instruction Fetch Deferral Using Static Slack | Gregory A. Muthler, David Crowe, Sanjay J. Patel, Steven Lumetta
Reducing Register Ports for Higher Speed and Lower Energy | Il Park, Michael D. Powell, T. N. Vijaykumar
Three Extensions to Register Integration | Vlad Petric, Anne Bracy, Amir Roth
Fetching Instruction Streams | Alex Ramírez, Oliverio J. Santana, Josep-Lluís Larriba-Pey, Mateo Valero
A Quantitative Framework for Automated Pre-Execution Thread Selection | Amir Roth, Gurindar S. Sohi
Dynamic Frequency and Voltage Control for a Multiple Clock Domain Microarchitecture | Greg Semeraro, David H. Albonesi, Steve Dropsho, Grigorios Magklis, Sandhya Dwarkadas, Michael L. Scott
Register Write Specialization Register Read Specialization - A Path to Complexity-Effective Wide-Issue Superscalar Processors | André Seznec, Eric Toullec, Olivier Rochecouste
Optimizing Pipelines for Power and Performance | Viji Srinivasan, David M. Brooks, Michael Gschwind, Pradip Bose, Victor V. Zyuban, Philip N. Strenski, Philip G. Emma
Using Modern Graphics Architectures for General-Purpose Computing - A Framework and Analysis | Chris J. Thompson, Sahngyun Hahn, Mark Oskin
Microarchitectural Exploration with Liberty | Manish Vachharajani, Neil Vachharajani, David A. Penry, Jason A. Blome, David I. August
Orion - A Power-Performance Simulator for Interconnection Networks | Hangsheng Wang, Xinping Zhu, Li-Shiuan Peh, Sharad Malik
Compiler Managed Micro-Cache Bypassing for High Performance EPIC Processors | Youfeng Wu, Ryan N. Rakvic, Li-Ling Chen, Chyi-Chang Miao, George Chrysos, Jesse Fang
Energy Efficient Frequent Value Data Cache Design | Jun Yang, Rajiv Gupta
Compiler-Directed Instruction Cache Leakage Optimization | Wei Zhang, Jie S. Hu, Vijay Degalahal, Mahmut T. Kandemir, Narayanan Vijaykrishnan, Mary Jane Irwin
Master/Slave Speculative Parallelization | Craig B. Zilles, Gurindar S. Sohi

MICRO 2003

Paper Title | Authors
Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation | Dan Ernst, Nam Sung Kim, Shidhartha Das, Sanjay Pant, Rajeev R. Rao, Toan Pham, Conrad H. Ziesler, David T. Blaauw, Todd M. Austin, Krisztián Flautner, Trevor N. Mudge
VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power | Hai Li, Chen-Yong Cher, T. N. Vijaykumar, Kaushik Roy
A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor | Shubhendu S. Mukherjee, Christopher T. Weaver, Joel S. Emer, Steven K. Reinhardt, Todd M. Austin
TLC: Transmission Line Caches | Bradford M. Beckmann, David A. Wood
Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures | Zeshan Chishti, Michael D. Powell, T. N. Vijaykumar
Near-Optimal Precharging in High-Performance Nanoscale CMOS Caches | Se-Hyun Yang, Babak Falsafi
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction | Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi, Parthasarathy Ranganathan, Dean M. Tullsen
Runtime Power Monitoring in High-End Processors: Methodology and Empirical Data | Canturk Isci, Margaret Martonosi
Power-Driven Design of Router Microarchitectures in On-Chip Networks | Hangsheng Wang, Li-Shiuan Peh, Sharad Malik
Optimum Power/Performance Pipeline Depth | Allan Hartstein, Thomas R. Puzak
Processor Acceleration Through Automated Instruction Set Customization | Nathan Clark, Hongtao Zhong, Scott A. Mahlke
The Reconfigurable Streaming Vector Processor (RSVP™) | Silviu M. S. A. Chiricescu, Ray Essick, Brian Lucas, Phil May, Kent Moat, Jim Norris, Michael A. Schuette, Ali Saidi
Scaling and Characterizing Database Workloads: Bridging the Gap Between Research and Practice | Richard A. Hankins, Trung A. Diep, Murali Annavaram, Brian Hirano, Harald Eri, Hubert Nueckel, John Paul Shen
Generational Cache Management of Code Traces in Dynamic Optimization Systems | Kim M. Hazelwood, Michael D. Smith
The Performance of Runtime Data Cache Prefetching in a Dynamic Optimization System | Jiwei Lu, Howard Chen, Rao Fu, Wei-Chung Hsu, Bobbie Othmer, Pen-Chung Yew, Dong-yuan Chen
IA-32 Execution Layer: A Two-Phase Dynamic Translator Designed to Support IA-32 Applications on Itanium-Based Systems | Leonid Baraz, Tevi Devor, Orna Etzion, Shalom Goldenberg, Alex Skaletsky, Yun Wang, Yigel Zemach
LLVA: A Low-level Virtual Instruction Set Architecture | Vikram S. Adve, Chris Lattner, Michael Brukman, Anand Shukla, Brian Gaeke
Comparing Program Phase Detection Techniques | Ashutosh S. Dhodapkar, James E. Smith
Using Interaction Costs for Microarchitectural Bottleneck Analysis | Brian A. Fields, Rastislav Bodík, Mark D. Hill, Chris J. Newburn
Fast Path-Based Neural Branch Prediction | Daniel A. Jiménez
Hardware Support for Control Transfers in Code Caches | Ho-Seop Kim, James E. Smith
Exploiting Value Locality in Physical Register Files | Saisanthosh Balakrishnan, Gurindar S. Sohi
Macro-Op Scheduling: Relaxing Scheduling Loop Constraints | Ilhyun Kim, Mikko H. Lipasti
WaveScalar | Steven Swanson, Ken Michelson, Andrew Schwerin, Mark Oskin
Universal Mechanisms for Data-Parallel Architectures | Karthikeyan Sankaralingam, Stephen W. Keckler, William R. Mark, Doug Burger
Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors | Enric Gibert, F. Jesús Sánchez, Antonio González
Instruction Replication for Clustered Microarchitectures | Alex Aletà, Josep M. Codina, Antonio González, David R. Kaeli
Efficient Memory Integrity Verification and Encryption for Secure Processors | G. Edward Suh, Dwaine E. Clarke, Blaise Gassend, Marten van Dijk, Srinivas Devadas
Fast Secure Processor for Inhibiting Software Piracy and Tampering | Jun Yang, Youtao Zhang, Lan Gao
IPStash: A Power-Efficient Memory Architecture for IP-Lookup | Stefanos Kaxiras, Georgios Keramidas
Design and Implementation of High-Performance Memory Systems for Future Packet Buffers | Jorge García-Vidal, Jesús Corbal, Llorenç Cerdà, Mateo Valero
Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining | Ronald D. Barnes, Erik M. Nystrom, John W. Sias, Sanjay J. Patel, Nacho Navarro, Wen-mei W. Hwu
Scalable Hardware Memory Disambiguation for High ILP Processors | Simha Sethumadhavan, Rajagopalan Desikan, Doug Burger, Charles R. Moore, Stephen W. Keckler
Reducing Design Complexity of the Load/Store Queue | Il Park, Chong-liang Ooi, T. N. Vijaykumar
Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors | Haitham Akkary, Ravi Rajwar, Srikanth T. Srinivasan