INSTRUCT-MUSICGEN: UNLOCKING TEXT-TO-MUSIC EDITING FOR MUSIC LANGUAGE MODELS VIA INSTRUCTION TUNING

Zhang, Yixiao
Ikemiya, Yukara
Choi, Woosung
Murata, Naoki
Martínez-Ramírez, Marco A.
Lin, Liwei
Xia, Gus
Liao, Weihsiang
Mitsufuji, Yuki
Dixon, Simon
Abstract
The task of text-to-music editing, which employs text queries to modify music (e.g., by changing its style or adjusting instrumental components), presents unique challenges and opportunities for AI-assisted music creation. Previous approaches in this domain have been constrained by the necessity to train specific editing models from scratch, which is both resource-intensive and inefficient; other research uses large language models to predict edited music, resulting in imprecise audio reconstruction. In this paper, we introduce Instruct-MusicGen, a novel approach that finetunes a pretrained MusicGen model to efficiently follow editing instructions such as adding, removing, or separating stems. Our approach modifies the original MusicGen architecture by incorporating a text fusion module and an audio fusion module, which allow the model to process instruction texts and audio input concurrently and yield the desired edited music. Remarkably, although Instruct-MusicGen introduces only ∼8% new parameters to the original MusicGen model and trains for only 5K steps, it achieves superior performance across all tasks compared to existing baselines. This advancement not only enhances the efficiency of text-to-music editing but also broadens the applicability of music language models in dynamic music production environments.
Citation
Y. Zhang, Y. Ikemiya, W. Choi, N. Murata, M. A. Martínez-Ramírez, L. Lin, G. Xia, W.-H. Liao, Y. Mitsufuji, and S. Dixon, “Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning,” in Proc. of the 26th Int. Society for Music Information Retrieval Conf., Daejeon, South Korea, 2025.
Source
Proceedings of the International Society for Music Information Retrieval Conference
Conference
26th International Society for Music Information Retrieval Conference (ISMIR 2025)
Publisher
International Society for Music Information Retrieval