Annual IEEE/ACM International Symposium on Microarchitecture®

MICRO Test of Time Award

List of Eligible Papers for the 2015 Award

View the 2015 call for nominations (closed).

MICRO 1993

Paper Title | Authors
Efficient Scheduling of Fine Grain Parallelism in Loops | M. Rajagopalan, V. H. Allan
Employing Finite Automata for Resource Scheduling | Thomas Müller
GPMB—Software Pipelining Branch-Intensive Loops | Zhihong Tang, Gang Chen, Chihong Zhang, Yingwei Zhang, Bogong Su, Stanley Habib
A Microarchitectural Performance Evaluation of a 3.2 Gbyte/s Microprocessor Bus | Tim Stanley, Michael Upton, Patrick Sherhart, Trevor Mudge, Richard Brown
Two-Ported Cache Alternatives for Superscalar Processors | Andrew Wolfe, Rodney Boleyn
A Study On the Number of Memory Ports in Multiple Instruction Issue Machines | Soo-Mook Moon, Kemal Ebcioğlu
The 16-Fold Way: A Microparallel Taxonomy | Barton J. Sano, Alvin M. Despain
A Comparative Performance Evaluation of Various State Maintenance Mechanisms | Michael Butler, Yale Patt
Dynamically Scheduled VLIW Processors | B. Ramakrishna Rau
Prophetic Branches: A Branch Architecture for Code Compaction and Efficient Execution | Apoorv Srivastava, Alvin M. Despain
A Comparison of Superscalar and Decoupled Access/Execute Architectures | Matthew K. Farrens, Pius Ng, Phil Nico
Measuring Limits of Parallelism and Characterizing Its Vulnerability to Resource Constraints | Lawrence Rauchwerger, Pradeep K. Dubey, Ravi Nair
An Evaluation of Bottom-Up and Top-Down Thread Generation Techniques | A. P. W. Böhm, W. A. Najjar, B. Shankar, L. Roh
Techniques for Extracting Instruction Level Parallelism On MIMD Architectures | Gary Tyson, Matthew Farrens
Predictability of Load/Store Instruction Latencies | Santosh G. Abraham, Rabin A. Sugumar, Daniel Windheiser, B. R. Rau, Rajiv Gupta
Control Flow Prediction for Dynamic ILP Processors | Dionisios N. Pnevmatikatos, Manoj Franklin, Gurindar S. Sohi
Branch History Table Indexing to Prevent Pipeline Bubbles in Wide-Issue Superscalar Processors | Tse-Yu Yeh, Yale N. Patt
Clocked and Asynchronous Instruction Pipelines | Mark A. Franklin, Tienyo Pan
An Analysis of Dynamic Scheduling Techniques for Symbolic Applications | Alessandra Costa, Alessandro De Gloria, Paolo Faraboschi, Mauro Olivieri
MIDEE: Smoothing Branch and Instruction Cache Miss Penalties On Deep Pipelines | Nathalie Drach, André Seznec
Register Renaming and Dynamic Speculation: An Alternative Approach | Mayan Moudgill, Keshav Pingali, Stamatis Vassiliadis
Speculative Execution Exception Recovery Using Write-Back Suppression | Roger A. Bringmann, Scott A. Mahlke, Richard E. Hank, John C. Gyllenhaal, Wen-mei W. Hwu
EXPLORER: A Retargetable and Visualization-Based Trace-Driven Simulator for Superscalar Processors | Trung A. Diep, John P. Shen, Mike Phillip
An Extended Classification of Inter-Instruction Dependency and Its Application in Automatic Synthesis of Pipelined Processors | Ing-Jer Huang, Alvin M. Despain
Superblock Formation Using Static Program Analysis | Richard E. Hank, Scott A. Mahlke, Roger A. Bringmann, John C. Gyllenhaal, Wen-mei W. Hwu
Instruction Scheduling for the Motorola 88110 | Mark Smotherman, Shuchi Chawla, Stan Cox, Brian Malloy
A VLIW Architecture Based On Shifting Register Files | H. Fatih Uğurdağ, Christos A. Papachristou

MICRO 1994

Paper Title | Authors
Static Branch Frequency and Program Profile Analysis | Youfeng Wu, James R. Larus
Using Branch Handling Hardware to Support Profile-Driven Optimization | Thomas M. Conte, Burzin A. Patel, J. Stan Cox
Branch Classification: A New Mechanism for Improving Branch Predictor Performance | Po-Yung Chang, Eric Hao, Tse-Yu Yeh, Yale Patt
Techniques for Compressing Program Address Traces | Andrew R. Pleszkun
Height Reduction of Control Recurrences for ILP Processors | Michael Schlansker, Vinod Kathail, Sadun Anik
Theoretical Modeling of Superscalar Processor Performance | Derek B. Noonburg, John P. Shen
Iterative Modulo Scheduling: An Algorithm for Software Pipelining Loops | B. Ramakrishna Rau
Minimum Register Requirements for a Modulo Schedule | Alexandre E. Eichenberger, Edward S. Davidson, Santosh G. Abraham
Minimizing Register Requirements Under Resource-Constrained Rate-Optimal Software Pipelining | R. Govindarajan, Erik R. Altman, Guang R. Gao
Software Pipelining with Register Allocation and Spilling | Jian Wang, Andreas Krall, M. Anton Ertl, Christine Eisenbeis
Reducing Memory Traffic with CRegs | Peter Dahl, Matthew O'Keefe
Dynamic Memory Disambiguation for Array References | David Bernstein, Doron Cohen, Dror E. Maydan
A Study of Pointer Aliasing for Software Pipelining Using Run-Time Disambiguation | Bogong Su, Stanley Habib, Wei Zhao, Jian Wang, Youfeng Wu
Data Relocation and Prefetching for Programs with Large Data Sets | Yoji Yamada, John Gyllenhaal, Grant Haab, Wen-mei Hwu
Cache Designs with Partial Address Matching | Lishing Liu
Minimizing Branch Misprediction Penalties for Superpipelined Processors | Ching-Long Su, Alvin M. Despain
Facilitating Superscalar Processing Via a Combined Static/Dynamic Register Renaming Scheme | Eric Sprangle, Yale Patt
Improving Resource Utilization of the MIPS R8000 Via Post-Scheduling Global Instruction Distribution | Raymond Lo, Sun Chan, Fred Chow, Shin-Ming Liu
A Comparison of Two Pipeline Organizations | Michael Golden, Trevor Mudge
A Fill-Unit Approach to Multiple Instruction Issue | Manoj Franklin, Mark Smotherman
A High-Performance Microarchitecture with Hardware-Programmable Functional Units | Rahul Razdan, Michael D. Smith
The Anatomy of the Register File in a Multiscalar Processor | Scott E. Breach, T. N. Vijaykumar, Gurindar S. Sohi
Register File Port Requirements of Transport Triggered Architectures | Jan Hoogerbrugge, Henk Corporaal
The Effects of Predicated Execution On Branch Prediction | Gary Scott Tyson
Analysis of the Conditional Skip Instructions of the HP Precision Architecture | Jonathan P. Vogel, Bruce K. Holmer
Characterizing the Impact of Predicated Execution On Branch Prediction | Scott A. Mahlke, Richard E. Hank, Roger A. Bringmann, John C. Gyllenhaal, David M. Gallagher, Wen-mei W. Hwu
The Effect of Speculatively Updating Branch History On Branch Prediction Accuracy, Revisited | Eric Hao, Po-Yung Chang, Yale N. Patt

MICRO 1995

Paper Title | Authors
Performance Issues in Correlated Branch Prediction Schemes | Nicolas Gloy, Michael D. Smith, Cliff Young
Dynamic Path-Based Branch Correlation | Ravi Nair
The Predictability of Branches in Libraries | Brad Calder, Dirk Grunwald, Amitabh Srivastava
The Performance Impact of Incomplete Bypassing in Processor Pipelines | Pritpal S. Ahuja, Douglas W. Clark, Anne Rogers
Efficient Instruction Scheduling Using Finite State Automata | Vasanth Bala, Norman Rubin
Critical Path Reduction for Scalar Programs | Michael Schlansker, Vinod Kathail
A Limit Study of Local Memory Requirements Using Value Reuse Profiles | Andrew S. Huang, John P. Shen
Zero-Cycle Loads: Microarchitecture Support for Reducing Load Latency | Todd M. Austin, Gurindar S. Sohi
A Modified Approach to Data Cache Management | Gary Tyson, Matthew Farrens, John Matthews, Andrew R. Pleszkun
Petri Net Versus Modulo Scheduling for Software Pipelining | Vicki H. Allan, U. R. Shah, K. M. Reddy
Modulo Scheduling with Multiple Initiation Intervals | Nancy J. Warter-Perez, Noubar Partamian
Spill-Free Parallel Scheduling of Basic Blocks | B. Natarajan, M. Schlansker
Improving Instruction-Level Parallelism by Loop Unrolling and Dynamic Memory Disambiguation | Jack W. Davidson, Sanjay Jinturkar
Self-Regulation of Workload in the Manchester Data-Flow Computer | John R. Gurd, David F. Snelling
The M-Machine Multicomputer | Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny Gurevich, Whay S. Lee
Region-Based Compilation: An Introduction and Motivation | Richard E. Hank, Wen-mei W. Hwu, B. Ramakrishna Rau
An Experimental Study of Several Cooperative Register Allocation and Instruction Scheduling Strategies | Cindy Norris, Lori L. Pollock
Register Allocation for Predicated Code | Alexandre E. Eichenberger, Edward S. Davidson
Partial Resolution in Branch Target Buffers | Barry Fagin, Kathryn Russell
A System Level Perspective On Branch Architecture Performance | Brad Calder, Dirk Grunwald, Joel Emer
Dynamic Rescheduling: A Technique for Object Code Compatibility in VLIW Architectures | Thomas M. Conte, Sumedh W. Sathaye
Improving CISC Instruction Decoding Performance Using a Fill Unit | Mark Smotherman, Manoj Franklin
SPAID: Software Prefetching in Pointer- and Call-Intensive Environments | Mikko H. Lipasti, William J. Schmidt, Steven R. Kunkel, Robert R. Roediger
An Effective Programmable Prefetch Engine for On-Chip Caches | Tien-Fu Chen
Cache Miss Heuristics and Preloading Techniques for General-Purpose Programs | Toshihiro Ozawa, Yasunori Kimura, Shin'ichiro Nishizaki
Alternative Implementations of Hybrid Branch Predictors | Po-Yung Chang, Eric Hao, Yale N. Patt
Control Flow Prediction with Tree-Like Subgraphs for Superscalar Processors | Simonjit Dutta, Manoj Franklin
The Role of Adaptivity in Two-Level Adaptive Branch Prediction | Stuart Sechrest, Chih-Chieh Lee, Trevor Mudge
Design of Storage Hierarchy in Multithreaded Architectures | Lucas Roh, Walid A. Najjar
An Investigation of the Performance of Various Instruction-Issue Buffer Topologies | Stéphan Jourdan, Pascal Sainrat, Daniel Litaize
Decoupling Integer Execution in Superscalar Processors | Subbarao Palacharla, J. E. Smith
Exploiting Short-Lived Variables in Superscalar Processors | Luis A. Lozano, Guang R. Gao
Partitioned Register File for TTAs | Johan Janssen, Henk Corporaal
Disjoint Eager Execution: An Optimal Form of Speculative Execution | Augustus K. Uht, Vijay Sindagi, Kelley Hall
Unrolling-Based Optimizations for Modulo Scheduling | Daniel M. Lavery, Wen-mei W. Hwu
Stage Scheduling: A Technique to Reduce the Register Requirements of a Modulo Schedule | Alexandre E. Eichenberger, Edward S. Davidson
Hypernode Reduction Modulo Scheduling | Josep Llosa, Mateo Valero, Eduard Ayguadé, Antonio González

MICRO 1996

Paper Title | Authors
A Persistent Rescheduled-Page Cache for Low Overhead Object Code Compatibility in VLIW Architectures | Thomas M. Conte, Sumedh W. Sathaye, Sanjeev Banerjia
Integrating a Misprediction Recovery Cache (MRC) Into a Superscalar Pipeline | James O. Bondi, Ashwini K. Nanda, Simonjit Dutta
Trace Cache: A Low Latency Approach to High Bandwidth Instruction Fetching | Eric Rotenberg, Steve Bennett, James E. Smith
Accurate and Practical Profile-Driven Compilation Using the Profile Buffer | Thomas M. Conte, Kishore N. Menezes, Mary Ann Hirsch
Efficient Path Profiling | Thomas Ball, James R. Larus
Profile-Driven Instruction Level Parallel Scheduling with Application to Super Blocks | C. Chekuri, R. Johnson, R. Motwani, B. Natarajan, B. R. Rau, M. Schlansker
Speculative Hedge: Regulating Compile-Time Speculation Against Profile Variations | Brian L. Deitrich, Wen-mei W. Hwu
Hot Cold Optimization of Large Windows/NT Applications | Robert Cohn, P. Geoffrey Lowney
Java Bytecode to Native Code Translation: The Caffeine Prototype and Preliminary Results | Cheng-Hsueh A. Hsieh, John C. Gyllenhaal, Wen-mei W. Hwu
Analysis Techniques for Predicated Code | Richard Johnson, Michael Schlansker
Global Predicate Analysis and Its Application to Register Allocation | David M. Gillies, Dz-ching Roy Ju, Richard Johnson, Michael Schlansker
Modulo Scheduling of Loops in Control-Intensive Non-Numeric Programs | Daniel M. Lavery, Wen-mei W. Hwu
Assigning Confidence to Conditional Branch Predictions | Erik Jacobsen, Eric Rotenberg, J. E. Smith
Compiler Synthesized Dynamic Branch Prediction | Scott Mahlke, Balas Natarajan
Wrong-Path Instruction Prefetching | Jim Pierce, Trevor Mudge
Design Decisions Influencing the UltraSPARC's Instruction Fetch Architecture | Robert Yung
Increasing the Instruction Fetch Rate Via Block-Structured Instruction Set Architectures | Eric Hao, Po-Yung Chang, Marius Evers, Yale N. Patt
Instruction Fetch Mechanisms for VLIW Architectures with Compressed Encodings | Thomas M. Conte, Sanjeev Banerjia, Sergei Y. Larin, Kishore N. Menezes, Sumedh W. Sathaye
Tango: A Hardware-Based Data Prefetching Technique for Superscalar Processors | Shlomit S. Pinter, Adi Yoaz
Exceeding the Dataflow Limit Via Value Prediction | Mikko H. Lipasti, John Paul Shen
The Performance Potential of Data Dependence Speculation & Collapsing | Yiannakis Sazeides, Stamatis Vassiliadis, James E. Smith
Heuristics for Register-Constrained Software Pipelining | Josep Llosa, Mateo Valero, Eduard Ayguadé
Software Pipelining Loops with Conditional Branches | Mark G. Stoodley, Corinna G. Lee
Combining Loop Transformations Considering Caches and Scheduling | Michael E. Wolf, Dror E. Maydan, Ding-Kai Chen
Instruction Scheduling and Executable Editing | Eric Schnarr, James R. Larus
Instruction Scheduling for the HP PA-8000 | David A. Dunn, Wei-Chung Hsu
Meld Scheduling: Relaxing Scheduling Constraints Across Region Boundaries | Santosh G. Abraham, Vinod Kathail, Brian L. Deitrich
Custom-Fit Processors: Letting Applications Define Architectures | Joseph A. Fisher, Paolo Faraboschi, Giuseppe Desoli
Optimization for a Superscalar Out-of-Order Machine | Anne M. Holler
Optimization of Machine Descriptions for Efficient Use | John C. Gyllenhaal, Wen-mei W. Hwu, B. Ramakrishna Rau

MICRO 1997

Paper Title | Authors
The Bi-Mode Branch Predictor | Chih-Chieh Lee, I-Cheng K. Chen, Trevor N. Mudge
Path-Based Next Trace Prediction | Quinn Jacobson, Eric Rotenberg, James E. Smith
Alternative Fetch and Issue Policies for the Trace Cache Fetch Mechanism | Daniel Holmes Friendly, Sanjay Jeram Patel, Yale N. Patt
Reducing the Performance Impact of Instruction Cache Misses by Writing Instructions Into the Reservation Stations Out-of-Order | Jared Stark, Paul Racunas, Yale N. Patt
On High-Bandwidth Data Cache Design for Multi-Issue Processors | Jude A. Rivers, Gary S. Tyson, Edward S. Davidson, Todd M. Austin
Run-Time Spatial Locality Detection and Optimization | Teresa L. Johnson, Matthew C. Merten, Wen-mei W. Hwu
A Comparison of Data Prefetching On an Access Decoupled and Superscalar Machine | G. P. Jones, N. P. Topham
The Design and Performance of a Conflict-Avoiding Cache | Nigel Topham, Antonio González, José González
Prediction Caches for Superscalar Processors | James E. Bennett, Michael J. Flynn
A Framework for Balancing Control Flow and Predication | David I. August, Wen-mei W. Hwu, Scott A. Mahlke
Evaluation of Scheduling Techniques On a SPARC-Based VLIW Testbed | Seongbae Park, SangMin Shim, Soo-Mook Moon
Tuning Compiler Optimizations for Simultaneous Multithreading | Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay S. Parekh, Dean M. Tullsen
Exploiting Dead Value Information | Milo M. Martin, Amir Roth, Charles N. Fischer
Trace Processors | Eric Rotenberg, Quinn Jacobson, Yiannakis Sazeides, Jim Smith
The Multicluster Architecture: Reducing Cycle Time Through Partitioning | Keith I. Farkas, Paul Chow, Norman P. Jouppi, Zvonko Vranesic
Out-of-Order Vector Architectures | Roger Espasa, Mateo Valero, James E. Smith
Initial Results On the Performance and Cost of Vector Microprocessors | Corinna G. Lee, Derek J. DeVries
The Filter Cache: An Energy Efficient Memory Structure | Johnson Kin, Munish Gupta, William H. Mangione-Smith
Improving Code Density Using Compression Techniques | Charles Lefurgy, Peter Bird, I-Cheng Chen, Trevor Mudge
Procedure Based Program Compression | Darko Kirovski, Johnson Kin, William H. Mangione-Smith
Improving the Accuracy and Performance of Memory Communication Through Renaming | Gary S. Tyson, Todd M. Austin
Microarchitecture Support for Improving the Performance of Load Target Prediction | Chung-Ho Chen, Akida Wu
Streamlining Inter-Operation Memory Communication Via Data Dependence Prediction | Andreas Moshovos, Gurindar S. Sohi
The Predictability of Data Values | Yiannakis Sazeides, James E. Smith
Value Profiling | Brad Calder, Peter Feller, Alan Eustace
Can Program Profiling Support Value Prediction? | Freddy Gabbay, Avi Mendelson
Highly Accurate Data Value Prediction Using Hybrid Predictors | Kai Wang, Manoj Franklin
ProfileMe: Hardware Support for Instruction-Level Profiling On Out-of-Order Processors | Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Chrysos
Procedure Placement Using Temporal Ordering Information | Nikolas Gloy, Trevor Blackwell, Michael D. Smith, Brad Calder
Predicting Data Cache Misses in Non-Numeric Applications Through Correlation Profiling | Todd C. Mowry, Chi-Keung Luk
Available Parallelism in Video Applications | Heng Liao, Andrew Wolfe
MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems | Chunho Lee, Miodrag Potkonjak, William H. Mangione-Smith
Cache Sensitive Modulo Scheduling | F. Jesús Sánchez, Antonio González
Unroll-and-Jam Using Uniformly Generated Sets | Steve Carr, Yiping Guan
Resource-Sensitive Profile-Directed Data Flow Analysis for Code Optimization | Rajiv Gupta, David A. Berson, Jesse Z. Fang