WORKSHOP: MY AI IS BETTER THAN YOURS WS25

MODELS: RAVE

Lecturers: Kim Albrecht, Lars Christian Schmidt, Yağmur Uçkunkaya

Winter 2025

What it does: Learns compact features of sound from your dataset and regenerates sound from them in real time.
Media: Sound

RAVE (Realtime Audio Variational autoEncoder) is an auto-encoder: it takes sound as input and is trained to reconstruct that sound as output. During its training, it learns a compact latent space (a small set of parameters that describe and represent the features of the sound), then decodes those parameters back into a waveform. Because it runs efficiently, it’s suitable for live performance, realtime sound generation and interactive installations.
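The encode-to-latent / decode-to-waveform round trip described above can be sketched with a toy example. This is not the real RAVE model (which uses deep convolutional networks and variational training); it is a minimal illustration of the auto-encoder idea, with random placeholder matrices standing in for trained weights:

```python
import numpy as np

rng = np.random.default_rng(0)

frame_size = 512   # samples per audio frame
latent_dim = 8     # size of the compact latent representation

# Hypothetical linear encoder/decoder weights (placeholders, not trained).
encoder = rng.standard_normal((latent_dim, frame_size)) / np.sqrt(frame_size)
decoder = rng.standard_normal((frame_size, latent_dim)) / np.sqrt(latent_dim)

def encode(frame: np.ndarray) -> np.ndarray:
    """Compress a frame of audio into a small latent vector."""
    return encoder @ frame

def decode(z: np.ndarray) -> np.ndarray:
    """Expand a latent vector back into a frame of audio."""
    return decoder @ z

frame = rng.standard_normal(frame_size)  # stand-in for real audio samples
z = encode(frame)                        # 512 samples -> 8 latent values
reconstruction = decode(z)               # 8 latent values -> 512 samples

print(z.shape, reconstruction.shape)     # (8,) (512,)
```

Training adjusts the encoder and decoder so that the reconstruction matches the input; here the weights are random, so only the shapes (512 samples in, 8 latent values, 512 samples out) illustrate the compression.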

Workings

  1. Collect audio

    • Any material you want the model to learn: voices, instruments, field recordings, effects, loops.
  2. Build & preprocess your dataset

    • A practical minimum is about 60 minutes of audio; more is better if you have it.
  3. Train your RAVE model

    • Train on your own dataset.
  4. Export & use the model

    • As a VST in a DAW, or in Max/Pure Data via nn~.
  5. Perform / integrate

    • Use live (performance), in installations, or pipe into other interactive systems.
    • Explore timbre transfer, sound morphing, latent improvisation, etc.
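One of the techniques named above, sound morphing, can be sketched in latent space: once two sounds are encoded, interpolating between their latent vectors and decoding the result morphs one timbre into the other. The vectors below are random placeholders standing in for latents produced by a trained encoder; the names `z_voice` and `z_cello` are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
latent_dim = 8  # matches the compact latent size of the model

z_voice = rng.standard_normal(latent_dim)  # hypothetical latent of a voice clip
z_cello = rng.standard_normal(latent_dim)  # hypothetical latent of a cello clip

def morph(z_a: np.ndarray, z_b: np.ndarray, t: float) -> np.ndarray:
    """Linear interpolation between two latent vectors; t sweeps 0 -> 1."""
    return (1.0 - t) * z_a + t * z_b

# Sweeping t during a performance gradually shifts the timbre from one
# sound toward the other; each interpolated vector would be fed to the
# decoder in real time.
halfway = morph(z_voice, z_cello, 0.5)
print(halfway.shape)  # (8,)
```

In a live setup the parameter `t` could be mapped to a fader or sensor, turning the latent space itself into a performable instrument.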

Why Try RAVE?

Resources