V&V of machine learning-based systems using simulators
Machine learning (ML), in particular deep learning, is a critical enabling technology for many of today's highly automated applications. Typical examples include intelligent transport systems (ITS), where ML solutions are used to extract a digital representation of the traffic context from high-dimensional sensor inputs. Unfortunately, ML models are opaque in nature (stochastic and data driven, with limited output interpretability), while functional safety requirements are strict and require a corresponding safety case [VVM1]. Furthermore, the development of systems that rely on deep learning introduces new types of faults [VVM2]. To meet the increasing need for trusted ML-based solutions [VVM3], numerous V&V approaches have been proposed.
Simulators can be used to support system testing as part of V&V of SCP requirements. An ideal simulator for testing the perception, planning and decision-making components of an autonomous system must realistically simulate the environment, the sensors, and the system's interaction with the environment through its actuators. Simulated environments bring several benefits to V&V of ML-based systems, particularly when:
- Data collection or data annotation is difficult, costly or time consuming
- Real-world testing endangers human safety
- Coverage of collected data is limited
- Reproducibility and scalability are important
The bulk of system-level testing of autonomous features in the automotive industry is carried out through on-road testing or naturalistic field operational tests. These activities, however, are expensive, dangerous, and ineffective [VVM4]. A feasible and efficient alternative is to conduct system-level testing through computer simulations that capture the entire self-driving vehicle and its operational environment using high-fidelity physics-based simulators. A growing number of public-domain and commercial simulators have been developed over the past few years to support realistic simulation of self-driving systems, e.g., TASS/Siemens PreScan, ESI Pro-SiVIC, CARLA, LGSVL, SUMO, AirSim, and BeamNG. Simulators will play an important role in the future of automotive V&V, as simulation is recognized as one of the main techniques in ISO/PAS 21448.
As the possible input space when testing automotive systems is practically infinite, attempts to design test cases that comprehensively cover the space of all possible simulation scenarios are futile. Hence, search-based software testing has been advocated as an effective and efficient strategy to generate test scenarios in simulators [VVM5, VVM6]. Another line of research proposes techniques to generate test oracles, i.e., mechanisms for determining whether a test case has passed or failed [VVM7]. Related to the oracle problem, several authors have proposed metamorphic testing of ML-based perception systems [VVM8, VVM9], i.e., executing transformed test cases while expecting the same output. Such transformations are well suited to testing in simulated environments, e.g., applying filters to camera input or modifying images using generative adversarial networks.
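As a rough illustration of how search-based test generation and a test oracle fit together, the sketch below runs a simple (1+1) evolutionary search over the parameters of a toy braking scenario. Everything in it is an assumption made for illustration: the `Scenario` fields, the parameter bounds, the 2 m safety margin, and the `simulate` function, which is a stand-in for a run in a real simulator and not the API of any of the tools listed above. The approaches in [VVM5, VVM6] use considerably more sophisticated search algorithms and fitness functions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario parameters (illustrative, not from any real tool)."""
    ego_speed: float        # m/s, initial speed of the ego vehicle
    pedestrian_gap: float   # m, initial distance to a static pedestrian
    detection_range: float  # m, distance at which perception detects the pedestrian


# Assumed search bounds for each parameter.
BOUNDS = {
    "ego_speed": (5.0, 20.0),
    "pedestrian_gap": (10.0, 60.0),
    "detection_range": (5.0, 40.0),
}


def simulate(s: Scenario, dt: float = 0.05, decel: float = 6.0) -> float:
    """Toy stand-in for a simulator run.

    Returns the remaining gap (m) once the vehicle has stopped,
    or 0.0 if it reaches the pedestrian first.
    """
    position, speed = 0.0, s.ego_speed
    while speed > 0.0:
        gap = s.pedestrian_gap - position
        if gap <= 0.0:
            return 0.0  # collision
        if gap <= s.detection_range:
            speed = max(0.0, speed - decel * dt)  # brake once detected
        position += speed * dt
    return max(0.0, s.pedestrian_gap - position)


def violates_safety(remaining_gap: float, margin: float = 2.0) -> bool:
    """Test oracle: the run fails if the vehicle stops closer than the margin."""
    return remaining_gap < margin


def mutate(s: Scenario, rng: random.Random, scale: float = 0.1) -> Scenario:
    """Perturb every parameter with Gaussian noise, clamped to its bounds."""
    values = {}
    for name, (lo, hi) in BOUNDS.items():
        step = rng.gauss(0.0, scale * (hi - lo))
        values[name] = min(hi, max(lo, getattr(s, name) + step))
    return Scenario(**values)


def search(budget: int = 200, seed: int = 0):
    """(1+1) evolutionary search that minimises the remaining gap."""
    rng = random.Random(seed)
    best = Scenario(**{n: rng.uniform(lo, hi) for n, (lo, hi) in BOUNDS.items()})
    best_fitness = simulate(best)
    for _ in range(budget):
        candidate = mutate(best, rng)
        fitness = simulate(candidate)
        if fitness <= best_fitness:  # smaller gap means a more critical scenario
            best, best_fitness = candidate, fitness
    return best, best_fitness


if __name__ == "__main__":
    scenario, gap = search()
    print(scenario, gap, "FAIL" if violates_safety(gap) else "PASS")
```

With a real simulator, each call to `simulate` would correspond to one full simulation run, so the evaluation budget is typically the main cost of such a search.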
Simulation-based V&V offers the following strengths:
- Cost efficiency: Using simulation for V&V of automotive systems reduces the cost of using a real test track, actual vehicles, and instruments, all of which risk being damaged during testing.
- Time: Immediate responses from a simulator enable quicker feedback and thus shorten the software development cycle.
- Safety: Many vehicle collision and accident scenarios are currently tested using dedicated test and assessment protocols designed to be safe. However, testing an incomplete system always exposes testers to unpredictable dangers. Using simulators substantially reduces the risks of test driving an autonomous vehicle in urban areas.
- Edge cases: Many low-probability, safety-critical situations and hazards that would not be encountered on a test track can be generated in simulated environments.
Its main limitations are:
- The gap between simulation and reality
- Uncertainty of machine learning systems
- [VVM1] M. Borg, C. Englund, K. Wnuk, B. Duran, C. Levandowski, S. Gao, Y. Tan, H. Kaijser, H. Lönn, and J. Törnqvist. Safely entering the deep: A review of verification and validation for machine learning and a challenge elicitation in the automotive industry. Journal of Automotive Software Engineering, 1(1), pp. 1-19, 2018.
- [VVM2] N. Humbatova, G. Jahangirova, G. Bavota, V. Riccio, A. Stocco, and P. Tonella. Taxonomy of real faults in deep learning systems. In Proc. of the ACM/IEEE 42nd Int’l. Conference on Software Engineering, pp. 1110-1121, 2020.
- [VVM3] Assessment List for Trustworthy AI, High-Level Expert Group on AI (AI HLEG), European Commission, https://ec.europa.eu/newsroom/dae/document.cfm?doc_id=68342
- [VVM4] Koopman, P. and Wagner, M., 2016. Challenges in autonomous vehicle testing and validation. SAE International Journal of Transportation Safety, 4(1), pp. 15-24.
- [VVM5] Abdessalem, R.B., Nejati, S., Briand, L.C. and Stifter, T., 2018, May. Testing vision-based control systems using learnable evolutionary algorithms. In 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE) (pp. 1016-1026). IEEE.
- [VVM6] Gambi, A., Mueller, M. and Fraser, G., 2019, July. Automatically testing self-driving cars with search-based procedural content generation. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (pp. 318-328).
- [VVM7] Stocco, A., Weiss, M., Calzana, M. and Tonella, P., 2020, June. Misbehaviour prediction for autonomous driving systems. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (pp. 359-371).
- [VVM8] Tian, Y., Pei, K., Jana, S. and Ray, B., 2018, May. DeepTest: Automated testing of deep-neural-network-driven autonomous cars. In Proceedings of the 40th International Conference on Software Engineering (pp. 303-314).
- [VVM9] Zhang, M., Zhang, Y., Zhang, L., Liu, C., & Khurshid, S. (2018). DeepRoad: GAN-based metamorphic autonomous driving system testing. arXiv preprint arXiv:1802.02295.