
PIXEL: Adaptive Steering Via Position-wise Injection with eXact Estimated Levels under a Subspace Calibration

Yu, Manjiang
Li, Hongji
Singh, Priyanka
Li, Xue
Wang, Di
Hu, Lijie
Department
Machine Learning
Type
Conference proceeding
License
http://creativecommons.org/licenses/by/4.0/
Abstract
Reliable behavior control is central to deploying Large Language Models (LLMs) on the web. Activation steering offers a tuning-free route to aligning attributes (e.g., truthfulness) that ensure trustworthy generation. However, prevailing approaches rely on coarse heuristics and lack a principled account of where to steer and how strongly to intervene. To this end, we propose Position-wise Injection with eXact Estimated Levels (PIXEL), a position-wise activation steering framework that, in contrast to prior work, learns a property-aligned subspace from dual views (tail-averaged and end-token) and selects intervention strength via a constrained geometric objective with a closed-form solution, thereby adapting to token-level sensitivity without global hyperparameter tuning. PIXEL further performs sample-level orthogonal residual calibration to refine the global attribute direction and employs a lightweight position-scanning routine to identify receptive injection sites. We additionally provide representation-level guarantees for the minimal-intervention rule, supporting reliable alignment. Across diverse models and evaluation paradigms, PIXEL consistently improves attribute alignment while preserving the model's general capabilities, offering a practical and principled method for controllable LLM generation. Our code is available at https://anonymous.4open.science/r/PIXEL-Adaptive-Steering-95DC
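The abstract mentions choosing the intervention strength through a constrained geometric objective with a closed-form solution. The record does not state that objective, so the sketch below is only a generic illustration of the idea, not the authors' formulation: it assumes a single steering direction `v`, a hypothetical target level `tau`, and the constraint that the steered activation's projection onto the unit direction is at least `tau`. Under those assumptions the smallest shift has the closed form `alpha = max(0, tau - <h, v_hat>)`, computed independently at each token position. The function names are illustrative.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def minimal_steering(h, v, tau):
    """Smallest shift of h along v whose projection onto v_hat reaches tau.

    Solves: minimise ||delta||^2 subject to <h + delta, v_hat> >= tau.
    This is an assumed, generic constraint, not the objective from the paper.
    Returns (alpha, steered) with steered = h + alpha * v_hat.
    """
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("steering direction must be non-zero")
    v_hat = [x / norm for x in v]
    # No intervention is needed when the projection already meets tau.
    alpha = max(0.0, tau - dot(h, v_hat))
    steered = [x + alpha * u for x, u in zip(h, v_hat)]
    return alpha, steered


def steer_positions(hidden_states, v, tau):
    """Apply the rule separately at each position, so each gets its own alpha."""
    return [minimal_steering(h, v, tau) for h in hidden_states]


if __name__ == "__main__":
    states = [[1.0, 2.0], [0.0, 5.0]]
    for alpha, steered in steer_positions(states, [0.0, 2.0], 3.0):
        print(alpha, steered)
```

Because the strength is derived per position from the current activation, positions that already satisfy the constraint are left untouched, which is one simple way to read "adapting to token-level sensitivity without global hyperparameter tuning".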
Citation
M. Yu, H. Li, P. Singh, X. Li, D. Wang, L. Hu, "PIXEL: Adaptive Steering Via Position-wise Injection with eXact Estimated Levels under a Subspace Calibration," 2026, pp. 1574-1585.
Source
Conference
ACM Web Conference 2026
Keywords
46 Information and Computing Sciences, 4607 Graphics, Augmented Reality and Games
Publisher
Association for Computing Machinery