Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees

Heakl, Ahmed
Hashmi, Sarim
Abi, Chaimaa
Lee, Celine
Mahmoud, Abdulrahman
Department
Computer Science
Type
Conference proceeding
License
http://creativecommons.org/licenses/by/4.0/
Abstract
The hardware ecosystem is rapidly evolving, with increasing interest in translating low-level programs across different *instruction set architectures* (ISAs) in a quick, flexible, and correct way to enhance the portability and longevity of existing code. A particularly challenging class of this transpilation problem is translating between complex instruction set (CISC) and reduced instruction set (RISC) architectures, due to fundamental differences in instruction complexity, memory models, and execution paradigms. In this work, we introduce GG (**G**uaranteed **G**uess), an ISA-centric transpilation pipeline that combines the translation power of pre-trained large language models (LLMs) with the rigor of established software testing constructs. Our method uses an LLM to generate candidate translations from one ISA to another and embeds those candidates within a software-testing framework to build quantifiable confidence in the translation. We evaluate GG on two diverse datasets, enforce high code coverage (>98%) across unit tests, and achieve functional/semantic correctness of 99% on HumanEval programs and 49% on BringupBench programs. Further, we compare our approach to the state-of-the-art Rosetta 2 framework on Apple Silicon, showing 1.73× faster runtime performance, 1.47× better energy efficiency, and 2.41× better memory usage for our transpiled code, which demonstrates the effectiveness of GG for real-world CISC-to-RISC translation tasks. We will open-source our code, data, models, and benchmarks to establish a common foundation for ISA-level code translation research.
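The generate-then-test idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: plain Python callables stand in for the source program and the LLM-produced translations, and the names `guess_and_test`, `passes`, and `Verdict` are hypothetical. The sketch assumes that expected outputs are recorded from the source program and that a candidate is accepted only if it reproduces all of them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Sequence, Tuple

# A test case pairs input arguments with the output recorded from the source program.
TestCase = Tuple[tuple, object]


@dataclass
class Verdict:
    candidate_index: Optional[int]  # index of the first fully passing candidate, or None
    passed: int                     # tests passed by the accepted (or best rejected) candidate
    total: int                      # number of tests in the suite


def passes(candidate: Callable, args: tuple, expected: object) -> bool:
    """Run one test; a crash counts as a failure, not as an error in the harness."""
    try:
        return candidate(*args) == expected
    except Exception:
        return False


def guess_and_test(candidates: Iterable[Callable], tests: Sequence[TestCase]) -> Verdict:
    """Accept the first candidate that passes every test (hypothetical acceptance rule)."""
    best = 0
    for index, candidate in enumerate(candidates):
        passed = sum(passes(candidate, args, expected) for args, expected in tests)
        if passed == len(tests):
            return Verdict(index, passed, len(tests))
        best = max(best, passed)
    return Verdict(None, best, len(tests))


# Stand-in for the source program whose behavior the translation must preserve.
def source_program(a: int, b: int) -> int:
    return a + b


# Expected outputs are recorded by running the source program on chosen inputs.
inputs = [(1, 2), (0, 0), (-1, 5)]
tests = [(args, source_program(*args)) for args in inputs]

# Stand-ins for LLM guesses: the first is wrong, the second is correct.
candidates = [lambda a, b: a - b, lambda a, b: b + a]

result = guess_and_test(candidates, tests)
print(result)  # Verdict(candidate_index=1, passed=3, total=3)
```

Passing the test suite gives confidence bounded by the quality of the tests, which is why the abstract stresses enforcing high code coverage rather than claiming a formal proof of equivalence.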
Citation
A. Heakl, S. Hashmi, C. Abi, C. Lee, A. Mahmoud, "Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees," 2025, pp. 24474-24488.
Source
Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025
Conference
30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025
Publisher
Association for Computational Linguistics (ACL)