GitHub Homepage

Audio samples for our paper: “Efficient Neural and Numerical Methods for High-Quality Online Speech Spectrogram Inversion via Gradient Theorem”

This webpage provides representative audio samples of clean speech in WAV format. Each row corresponds to one randomly selected fragment from the LibriSpeech clean test split. Each column corresponds either to the ground-truth recording or to a model used to generate the waveform directly from the STFT magnitude spectrogram; the column names are listed in the header of the audio table below.

See our paper for more details.
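
For readers unfamiliar with the input representation, the following is a minimal sketch of how an STFT magnitude spectrogram can be computed from a waveform. It is not the code used in the paper: the function name `stft_magnitude`, the frame length `n_fft=512`, the hop size `hop=128`, the Hann window, and the synthetic 440 Hz test tone are all assumptions chosen for illustration. The point is that the phase is discarded, so every model in the table has to reconstruct it from the magnitude alone.

```python
import numpy as np


def stft_magnitude(wave, n_fft=512, hop=128):
    """Return the magnitude of a Hann-windowed STFT.

    Output shape is (frames, n_fft // 2 + 1). The frame length and hop
    size are illustrative assumptions, not values taken from the paper.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack(
        [wave[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Real-input FFT of each frame, then drop the phase by taking |.|.
    spectrum = np.fft.rfft(frames, axis=1)
    return np.abs(spectrum)


# Synthetic stand-in for a speech fragment: one second of a 440 Hz tone
# sampled at 16 kHz (hypothetical input, not LibriSpeech data).
sr = 16000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440.0 * t)
mag = stft_magnitude(wave)
```

Spectrogram inversion is the reverse task: given only `mag`, estimate a waveform whose STFT magnitude matches it.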

Audio Table

Ground Truth | Proposed | Prev. + Thomas | Prev. + direct | VOCOS | RTISI (50 iter.) | RTISI (5 iter.) | Strided + LA | Strided