Algorithm Level Fault Tolerance: a Technique to Cope with Long Duration Transient Faults in Matrix Multiplication Algorithms

Carlo Lisboa, Costas Argyrides, Dhiraj Pradhan, Luigi Carro. Algorithm Level Fault Tolerance: a Technique to Cope with Long Duration Transient Faults in Matrix Multiplication Algorithms. IEEE VLSI Test Symposium (VTS) 2008, April 2008. No electronic version available.

Abstract

For technologies beyond the 45 nm node, radiation-induced transients will last longer than one clock cycle. In this scenario, temporal redundancy techniques will no longer be able to cope with radiation-induced soft errors, while spatial redundancy techniques still impose high power and area overheads. The solution to this impasse is the use of algorithm-level techniques able to detect and correct errors at low cost. This paper proposes a new approach of this kind and applies it to the matrix multiplication algorithm. The proposed technique is compared with previously published fault tolerance techniques, and the detection and recomputation costs of both approaches are discussed.
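
To make the idea of algorithm-level detection and recomputation concrete, the following is a minimal sketch, not the method of the paper (which the abstract does not describe). It assumes integer matrices and uses a Freivalds-style check: instead of recomputing the O(n^3) product, it compares A(Br) with Cr for a random vector r, which costs O(n^2). The names `matmul`, `product_looks_correct`, and `checked_matmul`, and the retry policy, are hypothetical and chosen only for illustration.

```python
import random


def matmul(a, b):
    """Plain O(n^3) product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matvec(m, v):
    """Matrix-vector product, O(n^2)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def product_looks_correct(a, b, c, rounds=3):
    """Freivalds-style check (assumed here, not taken from the paper).

    Returns False if A(Br) != Cr for some random vector r, which proves
    that C is not the product A*B. Returns True if every round agrees.
    """
    for _ in range(rounds):
        # Nonzero integer entries, so a single corrupted element of C
        # always changes Cr and is therefore always detected.
        r = [random.randint(1, 1000) for _ in range(len(b[0]))]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False
    return True


def checked_matmul(a, b, max_retries=2):
    """Multiply, verify, and recompute if the check fails."""
    for _ in range(max_retries + 1):
        c = matmul(a, b)
        if product_looks_correct(a, b, c):
            return c
    raise RuntimeError("product failed verification after retries")


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
good = checked_matmul(a, b)

# Simulate a soft error that corrupts one element of the result.
bad = [row[:] for row in good]
bad[0][1] += 1
detected = not product_looks_correct(a, b, bad)
```

In this sketch the cost of detection is a few matrix-vector products, and the cost of correction is a full recomputation; a real scheme would also have to consider faults that strike the check itself.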
