Probabilistic models of time-domain speech signals




Thesis Committee: Brendan Frey (adviser), Sam Roweis (adviser), Aaron Hertzmann, Parham Aarabi, Li Deng  

This thesis addresses the problem of modeling speech directly in the time domain and of reconstructing time-domain speech signals from phaseless feature-domain representations. Processing speech in the time domain is generally not favored because accounting for variability in phase is not straightforward. Instead, it is common to process speech in a feature domain from which the phase components have been removed. However, many applications of speech processing require output in the time domain. In such cases, speech signals can either be processed in a phase-free feature domain and then transformed to the time domain by reconstructing the phase, or be processed directly in the time domain. In this thesis, we study how to reconstruct time-domain speech signals from phase-free feature representations and how to model and analyze speech signals directly in the time domain.


Probabilistic models of time-domain speech signals
Kannan Achan
PhD Thesis, Department of Computer Science, University of Toronto, Canada, 2007. PDF (2.97 MB)










kannan@psi.toronto.edu