Data-based Design | Models: RAVE

WORKSHOP: MY AI IS BETTER THAN YOURS WS25

MODELS: RAVE

Lecturers: Kim Albrecht, Lars Christian Schmidt, Yağmur Uçkunkaya

Winter 2025

What it does: Learns compact features of sound from your dataset and re-generates sound from them in real time.
Media: Sound

RAVE (Realtime Audio Variational autoEncoder) is an autoencoder for sound: it takes audio as input and is trained to reconstruct that audio as output. Along the way it learns a compact latent space (a small set of parameters that describe the audio) and decodes those parameters back into a waveform. Because it runs efficiently, it is suitable for live performance, realtime sound generation and transformation, and interactive installations.
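To make the encode/decode idea concrete, here is a toy sketch. It is not RAVE's architecture: RAVE learns its encoder and decoder as neural networks, while this example uses two fixed, hand-written functions. The frame size, the test signal, and the function names are all assumptions chosen for illustration.

```python
# Toy illustration only: NOT RAVE's architecture. RAVE learns its encoder and
# decoder from data; here both are fixed, hand-written functions.
import math

FRAME = 8  # hypothetical compression ratio: 8 samples -> 1 latent value


def encode(signal):
    """Compress each full frame of FRAME samples into one value (its mean)."""
    return [
        sum(signal[i:i + FRAME]) / FRAME
        for i in range(0, len(signal) - FRAME + 1, FRAME)
    ]


def decode(latent):
    """Expand each latent value back into FRAME samples (held constant)."""
    return [z for z in latent for _ in range(FRAME)]


# One second of a slow 2 Hz sine at a hypothetical 800 Hz sample rate.
signal = [math.sin(2 * math.pi * 2 * n / 800) for n in range(800)]
latent = encode(signal)
reconstruction = decode(latent)

# Mean squared error as a simple reconstruction measure. RAVE's own training
# objective is more elaborate; this only shows the "reconstruct the input" idea.
mse = sum((a - b) ** 2 for a, b in zip(signal, reconstruction)) / len(signal)
print(len(signal), "samples ->", len(latent), "latent values, MSE:", round(mse, 5))
```

The latent list is eight times shorter than the signal, yet the decoded version stays close to the original. A trained model plays the same game with far richer features, which is what makes the latent parameters worth steering by hand.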

Workings

1. Collect audio

Any sound material you want the model to learn from: voices, instruments, field recordings, sound archives
Rough guide: ~1 hour for playful experiments; 5 hours+ for more stable and realistic outputs closer to the training material
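To check a collection against the rough guide above, a small script along these lines may help. It is a sketch that assumes the recordings are uncompressed PCM .wav files (the only kind Python's standard wave module reads); the folder name my_dataset and both function names are hypothetical.

```python
import wave
from pathlib import Path


def total_hours(folder):
    """Sum the duration of all .wav files under `folder`, in hours."""
    seconds = 0.0
    for path in sorted(Path(folder).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            seconds += wav.getnframes() / wav.getframerate()
    return seconds / 3600


def describe(hours):
    """Compare a duration against the rough guide (about 1 hour, 5 hours+)."""
    if hours >= 5:
        return "5 hours+: enough to aim for stable, realistic outputs"
    if hours >= 1:
        return "about an hour or more: fine for playful experiments"
    return "under an hour: consider collecting more material"


if __name__ == "__main__":
    # "my_dataset" is a placeholder folder name; a missing folder counts as empty.
    hours = total_hours("my_dataset")
    print(f"{hours:.2f} h of audio, {describe(hours)}")
```

Other formats such as MP3 or FLAC would need an audio library or a conversion step first.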

2. Prep

Use RAVE’s preprocessing script to normalize, trim, and resample audio
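For readers new to these terms, the sketch below shows what normalizing and trimming mean on a short list of sample values. It is an illustration only and does not reproduce RAVE's preprocessing script; the function names, the 0.9 peak level, and the 0.01 silence threshold are assumptions. Resampling is left out because doing it well requires proper filtering.

```python
def peak_normalize(samples, peak=0.9):
    """Scale samples so the loudest one has absolute value `peak`."""
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest == 0.0:
        return list(samples)  # pure silence (or empty): nothing to scale
    return [s * peak / loudest for s in samples]


def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples quieter than `threshold`."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []  # everything is below the threshold
    return samples[loud[0]:loud[-1] + 1]


# A tiny made-up clip: near-silence at both ends, signal in the middle.
clip = [0.0, 0.001, 0.2, -0.45, 0.3, 0.002, 0.0]
prepared = peak_normalize(trim_silence(clip))
print(prepared)
```

Trimming first and normalizing second means the gain is computed only from the part of the clip that is kept.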

3. Train your RAVE model

Train on GPU using RAVE’s built-in tools (command line) or more simply via a Colab Notebook
A ready-to-use Colab notebook for training is available, thanks to Hexorcismos
First glitchy results can appear after a few hours; stable and realistic outputs closer to your dataset usually take a couple of days

4. Export & use

Export your model to run as a VST plugin or in Max/Pd via nn~
Map latent parameters to MIDI knobs or sensors
Explore timbre transfer, morphing, or full generative soundscapes
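The mapping and morphing ideas above can be sketched in a few lines. MIDI control-change values run from 0 to 127; the latent range of -3 to 3 is an assumption (a plausible span for a variational autoencoder's latent values, not a RAVE setting), and both function names are hypothetical. In Max/Pd the same arithmetic would be done with patch objects feeding nn~.

```python
def cc_to_latent(cc_value, low=-3.0, high=3.0):
    """Map a MIDI control-change value (0-127) linearly onto [low, high]."""
    cc_value = max(0, min(127, cc_value))  # clamp out-of-range input
    return low + (high - low) * cc_value / 127


def morph(z_a, z_b, t):
    """Linearly interpolate two latent vectors: t=0 gives z_a, t=1 gives z_b."""
    return [(1 - t) * a + t * b for a, b in zip(z_a, z_b)]


# A knob at its minimum and maximum positions.
print(cc_to_latent(0), cc_to_latent(127))

# Halfway between two made-up latent vectors.
print(morph([0.0, 2.0], [2.0, 0.0], 0.5))
```

Sweeping t slowly from 0 to 1 and decoding each intermediate vector is one simple way to hear a morph between two sounds.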

Why Try RAVE?

Timbral play: morph, explore, and “jam” inside a learned latent space of sounds
Realtime generation: sound generation fast enough to run on a CPU, for live performance and installation use
Open Source + Community: open-source code, pre-trained models, tutorials, and support
Build your own evolving audio palette from the recordings you curate

Resources

RAVE GitHub Repository
Hexorcismos Colab for RAVEv2
Example experiment: Serkan Sevilgen – RAVE trained on 28 hours of Istanbul soundscapes: https://www.youtube.com/watch?v=al9ipOLRPEY