WORKSHOP: MY AI IS BETTER THAN YOURS WS25
MODELS: RAVE
Lecturers: Kim Albrecht, Lars Christian Schmidt, Yağmur Uçkunkaya
Winter 2025
What it does: Learns compact features of sound from your dataset and regenerates sound from those features in real time.
Media: Sound
RAVE (Realtime Audio Variational autoEncoder) is an autoencoder: it takes sound as input and is trained to reconstruct that sound as output. During training, its encoder learns to compress audio into a compact latent space (a small set of parameters that describe and represent the features of the sound), and its decoder learns to turn those parameters back into a waveform. Because it runs efficiently, it is suitable for live performance, real-time sound generation, and interactive installations.
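To make the encode/decode idea concrete, here is a toy sketch in plain Python. It is not RAVE: RAVE learns its encoder and decoder with neural networks, whereas this stand-in uses a fixed, truncated cosine transform (DCT). The only point is the shape of the pipeline: a frame of samples goes in, a much smaller list of numbers (the "latent") comes out, and a waveform is rebuilt from it. All function names and sizes here are illustrative assumptions.

```python
import math

def encode(frame, n_latent):
    """Compress a frame into n_latent numbers (its first DCT-II coefficients).

    Stand-in for a learned encoder: RAVE would use a neural network here.
    """
    n = len(frame)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(frame))
        for k in range(n_latent)
    ]

def decode(latent, n):
    """Rebuild an n-sample frame from the latent (inverse of the truncated DCT)."""
    return [
        (2 / n) * (
            latent[0] / 2
            + sum(c * math.cos(math.pi / n * (i + 0.5) * k)
                  for k, c in enumerate(latent) if k > 0)
        )
        for i in range(n)
    ]

# A toy 64-sample frame made of two slow oscillations (hypothetical test signal).
N = 64
frame = [
    math.cos(math.pi / N * (i + 0.5) * 3) + 0.5 * math.cos(math.pi / N * (i + 0.5) * 5)
    for i in range(N)
]

latent = encode(frame, n_latent=8)   # 64 samples -> 8 numbers
rebuilt = decode(latent, N)          # 8 numbers -> 64 samples

max_error = max(abs(a - b) for a, b in zip(frame, rebuilt))
print(len(frame), "samples ->", len(latent), "latent values; max error:", max_error)
```

This toy frame only contains slow components, so 8 values are enough to rebuild it almost exactly. Real audio is far richer, which is why RAVE learns its compression from your dataset instead of using a fixed formula.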
Workings
- Collect audio
  - Any material you want the model to learn: voices, instruments, field recordings, effects, loops.
- Build & preprocess your dataset
  - A practical starting point is at least 60 minutes of audio; more is better if you have it.
  - Preprocess the dataset with the provided script to all (give here features)
- Train your RAVE model
  - Train on your own dataset.
  - Colab option: RAVE v2 Colab by Hexorcismos / Moisés Horta Valenzuela (GPU).
  - Rough guide: small datasets can finish in a few hours on a single GPU (great for first, glitchy experiments). Larger or cleaner datasets may need overnight to multi-day training (≈ 1–4 days) for more stable quality.
- Export & use the model
  - As a VST in a DAW, or in Max/Pure Data via nn~.
- Perform / integrate
  - Use live (performance), in installations, or pipe into other interactive systems.
  - Explore timbre transfer, sound morphing, latent improvisation, etc.
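For the dataset step above, a quick way to check whether a folder of recordings reaches the 60-minute guideline is to add up the file durations before preprocessing. The following is a minimal sketch using only Python's standard library. It assumes your material is stored as plain PCM .wav files (other formats such as mp3 or flac would need converting first, or a different library), and the function names are made up for this example.

```python
import wave
from pathlib import Path

MIN_MINUTES = 60  # guideline from the dataset step above

def wav_seconds(path):
    """Duration of one WAV file in seconds (number of frames / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def dataset_minutes(folder):
    """Total duration, in minutes, of all .wav files under folder (recursive)."""
    files = sorted(Path(folder).rglob("*.wav"))
    return sum(wav_seconds(p) for p in files) / 60

def enough_audio(folder, min_minutes=MIN_MINUTES):
    """True if the folder holds at least min_minutes of audio."""
    return dataset_minutes(folder) >= min_minutes
```

Usage, with "my_dataset" standing in for your own folder name: `print(dataset_minutes("my_dataset"), enough_audio("my_dataset"))`.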
Why Try RAVE?
- Real-time audio generation & transformation
- Timbre transfer: re-express one sound through the learned features of another sound
- Effect + Synth in one: transform inputs or generate directly from latents
- Open source: code, models, tutorials, and community
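Sound morphing and generating "directly from latents" can be pictured as moving through the latent space. A minimal sketch, assuming each sound is summarized by a latent vector of plain floats: the two vectors and their size of 4 below are invented for illustration. In practice the latents would come from the trained model's encoder, and each intermediate vector would be handed to its decoder (for example through nn~) to be turned into sound.

```python
def lerp_latents(z_a, z_b, t):
    """Blend two latent vectors: t=0 gives z_a, t=1 gives z_b."""
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    return [(1 - t) * a + t * b for a, b in zip(z_a, z_b)]

def morph_path(z_a, z_b, steps):
    """Evenly spaced latent vectors going from z_a to z_b (both included)."""
    if steps < 2:
        raise ValueError("need at least 2 steps")
    return [lerp_latents(z_a, z_b, i / (steps - 1)) for i in range(steps)]

# Hypothetical latents for two sounds (made-up numbers, not from a real model).
z_voice = [0.0, 1.0, -2.0, 0.5]
z_drum = [1.0, -1.0, 2.0, 0.5]

for z in morph_path(z_voice, z_drum, steps=5):
    print([round(v, 2) for v in z])
```

Straight-line blending is only the simplest choice. In a performance you might instead steer individual latent values by hand, with a controller, or with sensor data.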
Resources
- RAVE GitHub Repository (IRCAM): https://github.com/acids-ircam/RAVE
- RAVE v2 Colab (Hexorcismos / Moisés Horta Valenzuela): https://colab.research.google.com/drive/1ih-gv1iHEZNuGhHPvCHrleLNXvooQMvI?usp=sharing
- IRCAM RAVE experiment with a custom model trained on 28 hours of Istanbul soundscape recordings by Serkan Sevilgen: https://www.youtube.com/watch?v=al9ipOLRPEY&ab_channel=SerkanSevilgen