Published July 4, 2020 | Version 1.0
Thesis Open

Variational Parametric Models for Audio Synthesis

  • 1. Indian Institute of Technology Bombay

Contributors

Supervisor:

  • 1. Indian Institute of Technology Bombay

Description

With the advent of data-driven statistical modeling and abundant computing power, researchers are increasingly turning to deep learning for audio synthesis. These methods try to model audio signals directly in the time or frequency domain. In the interest of more flexible control over the generated sound, it could be more useful to work with a parametric representation of the signal that corresponds more directly to musical attributes such as pitch, dynamics and timbre. These parametric representations also facilitate better musical control of the synthesized output. We present VaPar Synth, a Variational Parametric Synthesizer that uses a conditional variational autoencoder trained on a suitable parametric representation. We demonstrate our proposed model's capabilities via the reconstruction and generation of instrumental tones with flexible control over their pitch. We also investigate a parametric model for violin tones, in particular the generative modeling of the residual bow noise, to achieve a more natural tone quality. To aid in our analysis, we introduce a dataset of Carnatic Violin Recordings in which bow noise is an integral part of the playing style of higher-pitched notes in specific gestural contexts. We obtain insights about the harmonic and residual components of the signal, as well as their interdependence, from observations on the latent space derived in the course of variational encoding of the spectral envelopes of the sustained sounds.

Files

thesis.pdf

Files (5.5 MB)

md5:ec740c0673526c1704d3416a6a994f6f
5.5 MB

Additional details

Related works