Reliability in computing devices is of utmost importance. When devices are integrated into cyber-physical systems such as cars, airplanes and Unmanned Aerial Vehicles (UAVs), human lives are at risk, so these devices must work reliably. CPUs (Central Processing Units) from leading technology provider Arm are widely used in portable consumer devices. They are also present in large-scale systems, from the largest supercomputer, Fugaku in Japan, to the laboratories of the US Department of Energy. More recently, Arm has been working on a line of products for autonomous vehicles.
The soft error reliability of microprocessors can be estimated pre-silicon using early design models, and post-silicon by accelerated beam testing on manufactured chips. In this study, the team compared the FIT (Failures-In-Time) rates measured in beam experiments on actual chips with those estimated by microarchitectural fault injection in early-stage CPU models. They used two Arm CPU cores, the Cortex-A5 and the Cortex-A9. The Cortex-A5 CPU is standalone, while the Cortex-A9 is embedded in a System on Chip (SoC).
Pre-silicon results were obtained using a fault injection framework configured to inject single-event transient faults into multiple components during system simulation, covering more than 90% of the SRAM cells inside the CPU core. At least 1000 single-bit transient faults were injected into each target component, for a total of 176,000 injections.
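As a rough illustration of how injection counts can be turned into an error rate estimate, the sketch below uses a common approach: the fraction of injected faults that corrupt the output of a component is scaled by the component's size in bits and by a raw per-bit FIT rate for the technology. This is a minimal sketch and not the framework used in the study. The component names, bit counts, corruption counts and the raw per-bit FIT value are all hypothetical. The margin-of-error helper shows the statistical precision that 1000 injections per component can offer, assuming a normal approximation at 95% confidence.

```python
import math

def corruption_fraction(corruptions, injections):
    """Fraction of injected faults that ended in data corruption."""
    return corruptions / injections

def margin_of_error(injections, p=0.5, z=1.96):
    """Normal-approximation margin of error for a measured fraction.

    p=0.5 is the worst case; z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1.0 - p) / injections)

def component_fit(bits, raw_fit_per_bit, corruptions, injections):
    """Estimated FIT of one component: size x raw rate x corruption fraction."""
    return bits * raw_fit_per_bit * corruption_fraction(corruptions, injections)

# Hypothetical raw soft error rate per SRAM bit (illustrative only).
RAW_FIT_PER_BIT = 1e-5

# Hypothetical campaign results: 1000 injections per component.
campaign = {
    "l1_data_cache": {"bits": 262144, "corruptions": 120, "injections": 1000},
    "register_file": {"bits": 2048, "corruptions": 80, "injections": 1000},
}

# The CPU-level estimate is the sum over all injected components.
total_fit = sum(
    component_fit(c["bits"], RAW_FIT_PER_BIT, c["corruptions"], c["injections"])
    for c in campaign.values()
)

print(f"Estimated data corruption FIT: {total_fit:.4f}")
print(f"Worst-case margin of error with 1000 injections: {margin_of_error(1000):.3f}")
```

With 1000 injections, the worst-case margin of error on each measured fraction is about 3 percentage points under these assumptions.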
For post-silicon measurements, the ChipIr instrument at ISIS was used. ChipIr delivers a neutron beam that mimics the effect of atmospheric neutrons on electronic devices, enabling measurement of the device FIT rate. The available neutron flux is ~8 orders of magnitude higher than the terrestrial flux.
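The arithmetic behind an accelerated beam measurement can be sketched as follows. The number of observed errors divided by the neutron fluence delivered to the chip gives an error cross section. Multiplying that cross section by the natural neutron flux and by 10^9 hours gives the FIT rate. The error count and fluence below are hypothetical. The reference natural flux of 13 neutrons per cm² per hour is an assumption here; it is the sea-level value commonly quoted from the JEDEC JESD89A standard. The acceleration factor of 10^8 simply restates the "~8 orders of magnitude" figure above.

```python
HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours

def cross_section_cm2(errors, fluence_n_per_cm2):
    """Error cross section: observed errors per unit neutron fluence."""
    return errors / fluence_n_per_cm2

def fit_from_beam(errors, fluence_n_per_cm2, natural_flux_n_per_cm2_h=13.0):
    """Scale a beam measurement to a FIT rate in the natural environment.

    The default natural flux is an assumed sea-level reference value.
    """
    sigma = cross_section_cm2(errors, fluence_n_per_cm2)
    return sigma * natural_flux_n_per_cm2_h * HOURS_PER_FIT_UNIT

def equivalent_natural_hours(beam_hours, acceleration_factor=1e8):
    """Natural-exposure time emulated by a given amount of beam time."""
    return beam_hours * acceleration_factor

# Hypothetical experiment: 50 data corruptions after a fluence of 1e11 n/cm^2.
observed_errors = 50
delivered_fluence = 1e11

fit = fit_from_beam(observed_errors, delivered_fluence)
years = equivalent_natural_hours(1.0) / (24 * 365)

print(f"Measured FIT rate: {fit:.2f}")
print(f"One beam hour emulates roughly {years:,.0f} years of natural exposure")
```

Under these assumptions, a single hour in the beam stands in for more than ten thousand years of exposure at ground level, which is what makes a statistically meaningful FIT measurement practical.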
This study demonstrated that, under several different system setups, microarchitectural fault injection provides an accurate estimate of the data corruption FIT rate before the device is implemented in silicon. The comparison between beam testing and fault injection is a significant step towards early error rate estimation. This is of particular interest for flexible architectures such as Arm's, which customers can tune by adding hardened solutions before implementation in silicon.