Nick Collins
To accompany the paper:
(2025) [PDF] [sound examples and code] "Unstable Audio: Code Bending Text-to-Music Generation". AES International Conference on Machine Learning and Artificial Intelligence for Audio, London, September 8-10
Associated GitHub project, with code for the hacks and for the figures in the paper: [UnstableAudio]
Sound Examples
Original text prompts and outputs
Baseline generation for the perturbations below:
'George Clinton and Kraftwerk are stuck in an elevator with only a sequencer to keep them company'
Alternative prompt example renders:
'Detroit techno circa 1988'
'Derrick May electronic dance music'
'early techno music'
'late 1980s Detroit techno'
Examples of perturbing weights at particular layers:
Following the paper, the three perturbation sizes are which0 = 0.03, which1 = 0.3, which2 = 3
Layer 3, all 3 perturbation sizes
Layer 240, all 3 perturbation sizes
Layer 708, all 3 perturbation sizes
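For readers who want a feel for what perturbing one layer at three sizes involves, here is a minimal sketch. It is not the code from the paper or the GitHub project: it assumes the perturbation is additive zero-mean Gaussian noise whose standard deviation is the perturbation size, and it uses a toy list of weight lists in place of the real model. The names perturb_layer, toy_layers and PERTURBATION_SIZES are hypothetical.

```python
import random

# Perturbation sizes as listed on this page (which0, which1, which2).
PERTURBATION_SIZES = [0.03, 0.3, 3.0]


def perturb_layer(layers, layer_index, size, seed=0):
    """Return a copy of `layers` with Gaussian noise added to one layer.

    Assumption: the noise is additive and zero-mean, with standard
    deviation `size`. The paper's actual scheme may differ.
    """
    rng = random.Random(seed)
    perturbed = [list(weights) for weights in layers]  # copy every layer
    perturbed[layer_index] = [
        w + rng.gauss(0.0, size) for w in perturbed[layer_index]
    ]
    return perturbed


# Toy stand-in for a model: 5 "layers" of 4 weights each (hypothetical).
toy_layers = [[0.1 * (i + j) for j in range(4)] for i in range(5)]

for which, size in enumerate(PERTURBATION_SIZES):
    changed = perturb_layer(toy_layers, layer_index=3, size=size)
    # With a fixed seed, the mean absolute change scales with `size`.
    delta = sum(abs(a - b) for a, b in zip(changed[3], toy_layers[3])) / 4
    print(f"which{which}: size={size}, mean |delta|={delta:.4f}")
```

Only the chosen layer is altered; all other layers are copied unchanged, so any audible difference in such an experiment can be attributed to that one layer.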
Examples of diverse processing:
Voices appear even though the model is trained on instrumentals:
Ambient glitch:
Scrunchy squelchy semi-vocal spasm:
Isolated hits with silence in between:
In a reverberant station setting/public service announcement, with squeals:
Sustained industrial noises: