



# Use-it or Lose-it: Wearout and Lifetime in Future Chip-Multiprocessors



Hyungjun Kim,<sup>1</sup> Arseniy Vitkovsky,<sup>2</sup> Paul V. Gratz,<sup>1</sup> Vassos Soteriou<sup>2</sup>

<sup>1</sup> Department of Electrical and Computer Engineering, Texas A&M University

<sup>2</sup> Department of Electrical Engineering, Computer Engineering and Informatics, Cyprus University of Technology



## **Chip-multiprocessor Wearout**





#### ITRS (International Technology Roadmap for Semiconductors): rates of wearout-induced failure to increase 10X in 10 years

- Hot carrier injection (HCI) and negative bias temperature instability (NBTI): transistors slow down with use

#### Wearout effects in CMPs:

#### Recoverable failures:

- 1) Core failure
  - Failure detection and remapping

#### Non-recoverable failures:

- 2) I/O device disconnection
  - Device unreachable
- 3) Network partition
  - Disruption of communication between cores
- 4) Individual link breakage
  - Deadlock potential
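
Failure scenarios 2 and 3 above are both reachability problems on the mesh. The following is a minimal sketch, not the authors' tooling, of how one might check whether a set of broken links partitions a 2-D mesh. The function names (`mesh_links`, `reachable`, `is_partitioned`) and the modelling of routers as `(x, y)` grid coordinates are assumptions made for illustration only.

```python
from collections import deque


def mesh_links(width, height):
    """Return every undirected link of a width x height 2-D mesh."""
    links = set()
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                links.add(frozenset({(x, y), (x + 1, y)}))
            if y + 1 < height:
                links.add(frozenset({(x, y), (x, y + 1)}))
    return links


def reachable(start, width, height, failed_links):
    """Breadth-first search over the links that have not failed."""
    adjacency = {}
    for link in mesh_links(width, height) - set(failed_links):
        a, b = tuple(link)
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def is_partitioned(width, height, failed_links):
    """True if at least one router cannot be reached from router (0, 0)."""
    return len(reachable((0, 0), width, height, failed_links)) < width * height
```

In this sketch a single broken link leaves an 8x8 mesh connected, whereas losing both links of a corner router isolates it, which corresponds to the "device unreachable" case if a peripheral is attached there. Connectivity alone says nothing about deadlock (scenario 4), which depends on the routing algorithm.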

The interconnect is a critical point of failure.



A 64-core chip-multiprocessor (CMP) with various peripherals interconnected via a 2-D mesh, with all four failure scenarios illustrated

## Use it or Lose it





### Analysis of real CMP workloads:

- Low loads in interconnect
- NBTI causes critical path slowdown
- Lack of load leads to interconnect breakdown and failure
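
To give a rough feel for why stress time matters, here is a toy calculation under a simplified power-law aging model, ΔVth = a · (duty · t)^n with n = 1/6, a form often used in the NBTI literature. It is not the model or the data behind this poster: the constants, the units, and the function names (`delta_vth`, `time_to_limit`) are illustrative assumptions, and `stress_duty` stands for the fraction of time a transistor spends under stress.

```python
def delta_vth(stress_duty, seconds, a=1.0, n=1.0 / 6.0):
    """Threshold-voltage shift (arbitrary units) after `seconds` of operation.

    Assumed toy model: shift = a * (stress_duty * seconds) ** n.
    """
    return a * (stress_duty * seconds) ** n


def time_to_limit(stress_duty, limit, a=1.0, n=1.0 / 6.0):
    """Time until the shift reaches `limit`, found by inverting the model."""
    return (limit / a) ** (1.0 / n) / stress_duty


if __name__ == "__main__":
    always_stressed = time_to_limit(1.0, limit=2.0)
    half_stressed = time_to_limit(0.5, limit=2.0)
    print(half_stressed / always_stressed)
```

Under this assumed model, the time to reach a fixed degradation limit scales inversely with the stress duty cycle, so halving the time a critical-path transistor spends under stress doubles its modelled lifetime. The lifetime figures reported on this poster come from the authors' own evaluation, not from this formula.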

### The *Use it or Lose it* wear-resistant router microarchitecture:

- Increases utilization of router critical path
- 22x lifetime improvement!



Lifetime improvement of an 8x8 CMP executing applications from the PARSEC benchmark suite

Session 2B (Alpha Gamma Rho Room) 4:00 PM today!