Unstable Audio code and examples

Nick Collins


To accompany the paper:

(2025) [PDF] [sound examples and code] "Unstable Audio: Code Bending Text-to-Music Generation". AES International Conference on Machine Learning and Artificial Intelligence for Audio, London, September 8-10
Associated GitHub project, with code for the hacks and for the figures in the paper: [UnstableAudio]

Sound Examples

Original text prompts and outputs

Baseline generation for the perturbations below:
'George Clinton and Kraftwerk are stuck in an elevator with only a sequencer to keep them company'

Alternative prompt example renders:
'Detroit techno circa 1988'
'Derrick May electronic dance music'
'early techno music'
'late 1980s Detroit techno'

Examples of perturbing weights at particular layers:

Following the paper, the perturbation sizes are which0 = 0.03, which1 = 0.3, which2 = 3.

Layer 3, all 3 perturbation sizes

Layer 240, all 3 perturbation sizes

Layer 708, all 3 perturbation sizes
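
For readers who want a feel for what a layer perturbation of this kind involves, here is a minimal sketch in plain Python. It is not the code from the paper or the GitHub project. It assumes the perturbation is additive Gaussian noise scaled by the perturbation size, which may differ from the scheme actually used, and it works on a toy list of weights rather than the real model. The function name `perturb_layer` and the toy data are hypothetical; only the layer indices (3, 240, 708) and the sizes (0.03, 0.3, 3) come from this page.

```python
import random

# Perturbation sizes quoted on this page (which0, which1, which2).
PERTURBATION_SIZES = [0.03, 0.3, 3.0]


def perturb_layer(params, layer_index, size, seed=0):
    """Return a copy of params with noise added to one layer only.

    params is a list of layers, each a flat list of float weights.
    This is a hypothetical stand-in for a real model's parameter tensors.
    Assumption: additive Gaussian noise with standard deviation `size`.
    """
    rng = random.Random(seed)
    # Copy every layer so the baseline weights are left untouched.
    perturbed = [list(layer) for layer in params]
    perturbed[layer_index] = [
        w + size * rng.gauss(0.0, 1.0) for w in perturbed[layer_index]
    ]
    return perturbed


# Toy "model": 709 layers of 4 weights each, so indices 3, 240 and 708 exist.
toy_params = [[0.1] * 4 for _ in range(709)]

for layer_index in (3, 240, 708):
    for size in PERTURBATION_SIZES:
        bent = perturb_layer(toy_params, layer_index, size)
        max_change = max(
            abs(a - b) for a, b in zip(bent[layer_index], toy_params[layer_index])
        )
        print(f"layer {layer_index}, size {size}: max weight change {max_change:.3f}")
```

With a fixed seed the same noise pattern is reused at every size, so in this sketch the three renders for a layer differ only in how far the weights are pushed from the baseline.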

Examples of diverse processing:

voices appear even though the model is trained on instrumentals:

ambient glitch:

scrunchy squelchy semi-vocal spasm:

Isolated hits with silence in between:

In a reverberant station setting/public service announcement, with squeals:

sustained industrial noises: