Self-stabilizing microprocessor: Analyzing and overcoming soft errors

Shlomi Dolev, Yinnon A. Haviv

Research output: Contribution to journalArticlepeer-review

31 Scopus citations

Abstract

Soft errors are changes in memory value caused by external radiation or electrical noise. Decreases in computing feature sizes and power usages and shorting the microcycle period enhance the influence of soft errors. Self-stabilizing systems are designed to be started in an arbitrary, possibly a corrupted, state due to, say, soft errors, and to converge to a desired behavior. Self-stabilization is defined by the state space of the components and is essentially a well-founded, clearly defined form of the terms self-healing, automatic-recovery, automatic-repair, and autonomic-computing. To implement a self-stabilizing system, one needs to ensure that the microprocessor that executes the program is self-stabilizing. A self-stabilizing microprocessor copes with any combination of soft errors, converging to perform fetch-decode-execute in fault-free periods. Still, it is important that the microprocessor will avoid convergence periods if possible by masking the effect of soft errors immediately. In this work, we present design schemes for a self-stabilizing microprocessor and a new technique for analyzing the effect of soft errors. Previous schemes for analyzing the effect of soft errors were based on simulations. In contrast, our scheme computes a lower bound on microprocessor reliability and enables the microprocessor designer to evaluate the reliability of the design and to identify reliability bottlenecks. When analyzing the resiliency of digital circuits to soft errors, we examine the logical masking, i.e., errors in internal nodes of the circuits that are masked later by the computation. We show that the problem of computing the reliability of a circuit such that logical masking is taken into account is an NP-hard problem.

Original languageEnglish
Pages (from-to)385-399
Number of pages15
JournalIEEE Transactions on Computers
Volume55
Issue number4
DOIs
StatePublished - 1 Apr 2006

Keywords

  • Microprocessor
  • Self-stabilization
  • Single event upset
  • Soft errors

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics

Fingerprint

Dive into the research topics of 'Self-stabilizing microprocessor: Analyzing and overcoming soft errors'. Together they form a unique fingerprint.

Cite this