Optimizing Encrypted Neural Networks: Model Design, Quantization and Fine-Tuning Using FHEW/TFHE
Ku, Yu-Te ; Liu, Feng-Hao ; Hsu, Chih-Fan ; Chang, Ming-Ching ; Hung, Shih-Hao ; Tu, I-Ping ; Chen, Wei-Chao
Department
Computer Science
Type
Journal article
Date
2025
Language
English
Abstract
Third-generation Fully Homomorphic Encryption (FHE), particularly the FHEW/TFHE schemes, is recognized for its balanced security requirements, small parameters, and low memory usage; however, current methods for Deep Neural Network (DNN) inference still incur high computational costs, limiting their practical applicability. This work demonstrates how to improve the practicality of third-generation technologies for DNN tasks while preserving their key advantages. Our work makes two main contributions. First, we developed a computational architecture called FHE-Neuron, which reconfigures the parameters and bootstrapping structure of traditional FHEW/TFHE Boolean operations. This architecture significantly reduces the cost of encrypted DNN inference by dynamically switching the precision of encrypted data during computation: high precision for cost-effective linear operations and low precision for computationally expensive nonlinear operations. Second, we introduced an FHE-aware quantization and fine-tuning framework that optimizes model parameters to align with FHE-Neuron's constraints, ensuring high accuracy under encrypted inference. We validate our approach on various neural network models across several computing platforms. In our experiments, our method achieves an average one-image inference time of 4.5 milliseconds on MNIST and 17 milliseconds on Fashion-MNIST, with accuracy rates of 96.52% and 88.57%, respectively. On the CIFAR-10 dataset, our system completes one image inference in 30 seconds with 90.5% accuracy.
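The precision-switching idea described in the abstract can be illustrated in plaintext (no encryption involved). The sketch below is a hypothetical simulation, not the paper's implementation or the FHEW/TFHE API: a layer accumulates its linear part at an assumed high bit-width, then rescales to an assumed small message space so the nonlinearity can be applied as a table lookup, standing in for the role a programmable bootstrap plays over a small plaintext space. All bit-widths, names, and scales here are illustrative assumptions.

```python
import numpy as np

# Plaintext sketch of the FHE-Neuron precision switch (illustrative only;
# bit-widths, function names, and scaling are assumptions, not the paper's).
HIGH_BITS = 8   # assumed precision for the cheap linear accumulation
LOW_BITS = 4    # assumed small message space for the costly nonlinearity

def quantize(x, bits):
    """Round and clamp to a signed integer grid of the given bit-width."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return np.clip(np.round(x), lo, hi).astype(np.int64)

def relu_table(bits):
    """Activation precomputed as a lookup table over the low-precision domain,
    analogous to evaluating a function during bootstrapping."""
    lo = -(2 ** (bits - 1))
    return {int(v): max(int(v), 0) for v in range(lo, 2 ** (bits - 1))}

def fhe_neuron_layer(x, w, scale):
    # 1. Linear part at high precision (cost-effective under FHE).
    acc = quantize(w @ x, HIGH_BITS)
    # 2. Switch to low precision before the nonlinear step (expensive under FHE).
    low = quantize(acc / scale, LOW_BITS)
    # 3. Apply ReLU via table lookup on the small message space.
    table = relu_table(LOW_BITS)
    return np.array([table[int(v)] for v in low])

x = np.array([3, -2, 1])
w = np.array([[1, 2, -1], [2, 1, 1]])
out = fhe_neuron_layer(x, w, scale=2)  # -> array([0, 2])
```

The FHE-aware quantization and fine-tuning framework in the paper would then train the model so that its weights and activations stay accurate under exactly this kind of bit-width constraint.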
Citation
Y.-T. Ku et al., “Optimizing Encrypted Neural Networks: Model Design, Quantization and Fine-Tuning Using FHEW/TFHE,” Proceedings on Privacy Enhancing Technologies, vol. 2025, no. 4, pp. 1075–1091, Oct. 2025, doi: 10.56553/POPETS-2025-0172
Source
Proceedings on Privacy Enhancing Technologies
Keywords
Encrypted inference, neural networks, Fully Homomorphic Encryption, FHEW/TFHE, bootstrapping, quantization, approximated computation, model fine-tuning
Publisher
Privacy Enhancing Technologies Board
