SaFliTe: Fuzzing Autonomous Systems via Large Language Models

Zhu, Taohong
Skapars, Adrians
Mackenzie, Fardeen
Kehoe, Declan
Newton, William
Embury, Suzanne M.
Sun, Youcheng
Department
Computer Science
Type
Conference proceeding
Date
2026
Language
English
Abstract
Fuzz testing is a widely adopted testing methodology in software engineering that offers an efficient means of testing software and identifying vulnerabilities. This paper presents a universal framework aimed at improving the efficiency of fuzz testing for Autonomous Systems (AS), particularly Unmanned Aerial Vehicle (UAV) autonomous systems. At its core is SaFliTe (Safe Flight Testing), a predictive component that evaluates whether a test case meets predefined safety criteria. By providing a large language model (LLM) with information about the test objective and the AS state, SaFliTe assesses the relevance of each test case. We evaluated SaFliTe by instantiating it with various LLMs, including GPT-3.5, Mistral-7B, and Llama2-7B, and integrating it into four fuzz testing tools: PGFuzz, DeepHyperion-UAV, CAMBA, and TUMB. These tools are designed specifically for testing autonomous drone control systems. The experimental results demonstrate that, compared to PGFuzz alone, SaFliTe increased the likelihood of selecting operations that triggered bugs in each fuzzing iteration by an average of 93.1%. Additionally, after integrating SaFliTe, the ability of DeepHyperion-UAV, CAMBA, and TUMB to generate test cases that caused system safety violations increased by 234.5%, 33.3%, and 17.8%, respectively. The benchmark used in the evaluation was from the CPS-UAV Tool Competition 2024.
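The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the 0 to 10 scale, and the names `Candidate`, `build_prompt`, `rank_candidates`, and `stub_score` are all assumptions made for illustration, and the LLM call is replaced by a stub so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Candidate:
    """A hypothetical fuzzing operation with its parameters."""
    name: str
    params: Dict[str, float]


def build_prompt(objective: str, state: Dict[str, float], cand: Candidate) -> str:
    """Combine test objective, system state, and one candidate into a prompt."""
    state_text = ", ".join(f"{k}={v}" for k, v in sorted(state.items()))
    param_text = ", ".join(f"{k}={v}" for k, v in sorted(cand.params.items()))
    return (
        f"Test objective: {objective}\n"
        f"Current system state: {state_text}\n"
        f"Candidate test case: {cand.name}({param_text})\n"
        "On a scale of 0 to 10, how likely is this test case "
        "to violate the objective?"
    )


def rank_candidates(
    objective: str,
    state: Dict[str, float],
    candidates: List[Candidate],
    score_fn: Callable[[str], float],
) -> List[Candidate]:
    """Order candidates from most to least relevant according to score_fn."""
    scored = [(score_fn(build_prompt(objective, state, c)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]


def stub_score(prompt: str) -> float:
    """Placeholder for an LLM call (assumption): favours altitude changes."""
    return 8.0 if "set_altitude" in prompt else 2.0


candidates = [
    Candidate("set_speed", {"mps": 5.0}),
    Candidate("set_altitude", {"metres": 0.5}),
]
ranked = rank_candidates(
    "the drone must stay above 2 metres",
    {"altitude_m": 10.0},
    candidates,
    stub_score,
)
print(ranked[0].name)
```

In a real integration, `stub_score` would be swapped for a function that sends the prompt to a chosen LLM and parses a numeric score from the reply; the fuzzer would then execute the top-ranked candidate first.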
Citation
T. Zhu et al., “SaFliTe: Fuzzing Autonomous Systems via Large Language Models,” Lecture Notes in Computer Science, vol. 16045 LNAI, pp. 245–258, 2026, doi: 10.1007/978-3-032-01486-3_20
Source
Lecture Notes in Computer Science
Conference
26th Annual Conference on Towards Autonomous Robotic Systems, TAROS 2025
Keywords
Autonomous System, Fuzzing, LLM, Aircraft Control, Antennas, Flight Testing, Integration Testing, Safety Engineering, Safety Testing, Unmanned Aerial Vehicles (UAV), Aerial Vehicle, Fuzz Testing, Language Model, Large Language Model, Test Case, Testing Methodology, Testing Software, Flight Control Systems
Publisher
Springer Nature