Thesis Committee: Brendan Frey (adviser), Sam Roweis (adviser), Aaron Hertzmann, Parham Aarabi, Li Deng
This thesis addresses the problem of modeling speech directly in the
time domain and reconstructing time-domain speech signals from
phaseless feature-domain representations. Processing of speech in the
time domain is generally not favored because accounting for
variability in phase is not straightforward. Instead, it is common to
process speech in a feature domain where the phase components have
been removed. However, many applications of speech processing require
that the output be in the time domain. In this case, speech signals
can be processed in a phase-free feature domain and then transformed
to the time domain by reconstructing the phase, or they can be
processed directly in the time domain. In this thesis, we study how to
reconstruct time-domain speech signals from phase-free feature
representations and how to model and analyze speech signals directly
in the time domain.
Probabilistic models of time-domain speech signals
Kannan Achan
PhD Thesis, Department of Computer Science, University of Toronto, Canada, 2007 [PDF] (2.97MB)