Fault-Injection in FPGAs
To inject, detect and fix faults in the configuration memory of the FPGA. Keywords: Safety, Fault injection, Fault insertion, Failure injection, Safety validation, Safety verification, Fault modelling, Fault handling, Fault propagation, Soft-error mitigation
To explore and evaluate the results of fault injection in an FPGA-based hardware platform and their propagation to other system layers, and to relate fault injection at the low-level hardware layer to potential faults at higher layers in order to reduce the test space.
The Configuration RAM (CRAM) of an FPGA is susceptible to Single-Event Upsets (SEUs). If a bit in the CRAM is flipped, the FPGA's functionality can change. To counteract this, the CRAM must be continually scrubbed by a Soft-Error Mitigation (SEM) core, which flips changed bits back to their original design state. The same mechanism can be used to flip bits intentionally and so inject errors into the CRAM.
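The inject-then-scrub cycle can be illustrated with a small software model. This is only a sketch under stated assumptions: the CRAM is modelled as a Python bytearray, the names flip_bit and scrub are hypothetical, and errors are found by comparing against a golden copy of the configuration. A real SEM core works on the device's configuration frames and typically relies on error-detection codes rather than a full golden image.

```python
# Toy software model only: real CRAM is reached through the device's
# configuration port, not through a Python bytearray.

def flip_bit(cram: bytearray, bit_index: int) -> None:
    """Invert one bit in place, modelling an upset or an injected fault."""
    byte_index, bit_offset = divmod(bit_index, 8)
    cram[byte_index] ^= 1 << bit_offset


def scrub(cram: bytearray, golden: bytes) -> list:
    """Compare against the golden image, repair, and return the corrected bit indices."""
    corrected = []
    for byte_index, (actual, expected) in enumerate(zip(cram, golden)):
        diff = actual ^ expected
        for bit_offset in range(8):
            if diff & (1 << bit_offset):
                corrected.append(byte_index * 8 + bit_offset)
        cram[byte_index] = expected  # restore the original design state
    return corrected


# Hypothetical 32-bit configuration image.
golden = bytes([0b10110010, 0b00001111, 0b11110000, 0b01010101])
cram = bytearray(golden)

flip_bit(cram, 13)               # inject a fault at (hypothetical) bit address 13
assert bytes(cram) != golden     # the configuration has now changed
repaired = scrub(cram, golden)   # one scrub pass finds and reverts the flip
print(repaired)                  # prints [13]
```

The same flip_bit routine serves both roles described above: called by the environment it stands in for an upset, and called by the test bench it is a deliberate injection.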
The Healing Core IP (HC IP) can inject, detect and fix faults at any desired location in an FPGA design. It uses Soft-Error Mitigation (SEM) cores to detect errors and Run-Time Reconfiguration (RTR) techniques to correct Single- and Multiple-Event Upsets (bit-flips) in the FPGA's configuration memory. It also has a classification system that can report faults and initiate appropriate countermeasures for some of them, so it can be used to implement self-repairing functionality in an FPGA system. By flipping bits in the configuration RAM of the FPGA, we can observe how the system reacts, how the fault manifests at system level, and how it affects the up-time, robustness and availability of the component, and we can evaluate what this implies with respect to relevant safety standards. Based on historical data, the V&V process should minimize the steps needed for (re-)certification.
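A fault-injection campaign of this kind can be sketched in a few lines. The example below is an illustration, not the HC IP's own classification scheme: it assumes a toy design consisting of a single 4-input look-up table, a hypothetical configuration value GOLDEN_CONFIG, and an application that only ever drives half of the possible input combinations. Each configuration bit is flipped in turn, the output is compared with the fault-free output, and the fault is labelled "failure" or "benign".

```python
from collections import Counter


def lut_output(config: int, inputs: int) -> int:
    """Model a 4-input look-up table: the inputs select one of 16 config bits."""
    return (config >> inputs) & 1


GOLDEN_CONFIG = 0x6996   # hypothetical LUT contents: 4-input parity (XOR)
USED_INPUTS = range(8)   # assumption: the application never drives input 3 high


def classify(bit: int) -> str:
    """Inject one bit-flip and report whether it is observable at the output."""
    faulty = GOLDEN_CONFIG ^ (1 << bit)  # inject
    for inputs in USED_INPUTS:           # observe
        if lut_output(faulty, inputs) != lut_output(GOLDEN_CONFIG, inputs):
            return "failure"
    return "benign"


# Exhaustive campaign over all 16 configuration bits of the toy design.
campaign = {bit: classify(bit) for bit in range(16)}
summary = Counter(campaign.values())
print(summary["failure"], summary["benign"])  # prints: 8 8
```

In this toy model half of the configuration bits never affect the observed behaviour. Identifying such bits at the hardware layer is one way the number of fault cases that must be examined at higher layers could be reduced.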
Fault injection in FPGAs makes it possible to:
Inject, detect and heal faults caused by Single-Event Upsets in the FPGA fabric. This makes it possible to assess the effects of Single-Event Upsets in the hardware platform and to see how they manifest themselves, both in software and at system level.
Inject, detect and heal faults during run-time. This is useful in V&V when assessing the safety of the FPGA system.
Fault injection in FPGAs paired with the Healing Core has the potential to:
Enable the use of FPGAs in safety-critical functions of a system. This would reduce the time, cost and effort of developing safe hardware platforms.
The method has the following limitations:
Using the Healing Core incurs some time, cost and effort overhead. At the moment it is unclear whether the costs of using it outweigh the costs of not using it.
The Healing Core is itself subject to faults. If the voter that reads or writes the ICAP (Internal Configuration Access Port) is hit, the fault cannot be detected and corrected. The FPGA then has to be reset, resulting in system downtime.
There may be faults in the FPGA that cannot be healed without resetting and rebooting the entire FPGA, resulting in system downtime.
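The voter limitation can be made concrete with a small sketch. It assumes that the voter is a bitwise two-out-of-three majority voter over redundant copies, which the text above does not state explicitly. The function names and the fault model (inverted output bits) are hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority vote."""
    return (a & b) | (a & c) | (b & c)


def faulty_majority(a: int, b: int, c: int, fault_mask: int) -> int:
    """The same voter with some output bits inverted, modelling an upset in the voter itself."""
    return majority(a, b, c) ^ fault_mask


word = 0b10110010
upset = word ^ 0b00001000  # one of the three replicas suffers a bit-flip

# A fault in a single replica is outvoted by the two healthy copies.
masked = majority(word, upset, word)

# A fault in the voter corrupts the result even though all replicas agree.
unmasked = faulty_majority(word, word, word, 0b00000001)

print(masked == word)    # prints True
print(unmasked == word)  # prints False
```

Redundancy protects what sits in front of the voter, but nothing in this structure checks the voter's own output, which is why a hit there leaves a reset as the only remedy.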