Lightning Session Presentation Order

  1. FPB: Fine-grained Power Budgeting to Improve Write Throughput of Multi-level Cell Phase Change Memory
  2. Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access
  3. Transactional Memory Architecture and Implementation for IBM System z
  4. Warped-DMR: Light-weight Error Detection for GPGPU
  5. The Performance Vulnerability of Architectural and Non-architectural Arrays to Permanent Faults
  6. NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures
  7. Cache-Conscious Wavefront Scheduling
  8. Libra: Tailoring SIMD Execution using Heterogeneous Hardware and Dynamic Configurability
  9. Unifying Primary Cache, Scratch, and Register File Memories in a Throughput Processor
  10. Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation
  11. KnightShift: Scaling the Energy Proportionality Wall Through Server-level Heterogeneity
  12. Rethinking DRAM Power Modes for Energy Proportionality
  13. CoScale: Coordinating CPU and Memory System DVFS in Server Systems
  14. Predicting Performance Impact of DVFS for Realistic Memory Systems
  15. Vector Extensions for Decision Support DBMS Acceleration
  16. NOC-Out: Microarchitecting a Scale-Out Processor
  17. SLICC: Self-Assembly of Instruction Cache Collectives for OLTP Workloads
  18. Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks
  19. AUDIT: Stress Testing the Automatic Way
  20. Accurate Fine-Grained Processor Power Proxies
  21. Fundamental Latency Trade-offs in Architecting DRAM Caches
  22. A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch
  23. CoLT: Coalesced Large-Reach TLBs
  24. NoRD: Node-Router Decoupling for Effective Power-gating of On-Chip Routers
  25. Dynamic Reconfiguration of 3D Photonic On-chip Interconnects for Maximizing Performance and Improving Fault Tolerance
  26. Addressing End-to-End Memory Access Latency in NoC-Based Multicores
  27. MorphCore: An Energy-Efficient Microarchitecture for High Performance ILP and High Throughput TLP
  28. Composite Cores: Pushing Heterogeneity into a Core
  29. Control-Flow Decoupling
  30. Spatiotemporal Coherence Tracking
  31. Predicting Coherence Communication by Tracking Synchronization Points at Run Time
  32. Vulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically
  33. Amoeba-Cache: Adaptive Blocks for Eliminating Waste in the Memory Hierarchy
  34. Improving Cache Management Policies Using Dynamic Reuse Distances
  35. Kernel Partitioning of Streaming Applications: A Statistical Approach to an NP-complete Problem
  36. Inferred Models for Dynamic and Sparse Hardware-Software Spaces
  37. SMARQ: Software-Managed Alias Register Queue for Dynamic Optimizations
  38. Profiling Data-Dependence to Assist Parallelization: Framework, Scope, and Optimization
  39. Neural Acceleration for General-Purpose Approximate Programs
  40. Designing a Programmable Wire-Speed Regular-Expression Matching Accelerator