N. Jojic, N. Petrovic, B. J. Frey and T. S. Huang 2000.
Transformed hidden Markov models: Estimating mixture models of images and inferring spatial transformations in video sequences.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2000, IEEE Computer Society Press, Los Alamitos, CA.
SEE THE VIDEOS (AVI FORMAT):
Processing a noisy sequence,
In this paper we describe a novel generative model for video
analysis called the transformed hidden Markov model (THMM).
The video sequence is
modeled as a set of frames generated by transforming a small number of
class images that summarize the sequence. For each frame, the
transformation and the class are discrete latent variables that depend on
the previous class and transformation in the sequence. The set of possible
transformations is defined in advance, and it can
include a variety of transformations, such as translation, rotation and shearing.
In each stage of such a
Markov model, a new frame is generated from a transformed Gaussian
distribution based on the class/transformation combination
generated by the Markov chain. This model can be viewed as an
extension of a transformed mixture of Gaussians through time.
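
The generative process described so far can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the paper's implementation: frames are 1-D arrays, the predefined transformation set is a list of circular shifts, the noise is isotropic, and the initial state is uniform. The names `sample_thmm`, `class_means`, `shifts` and `trans` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_thmm(class_means, noise_std, shifts, trans, n_frames, rng):
    """Sample frames from a toy THMM whose transformations are circular shifts.

    class_means: (C, D) array, one 1-D "class image" per row
    shifts:      list of T integer shifts (the predefined transformation set)
    trans:       (C*T, C*T) row-stochastic matrix over joint (class, shift) states
    """
    n_classes, dim = class_means.shape
    n_shifts = len(shifts)
    n_states = n_classes * n_shifts
    state = rng.integers(n_states)  # uniform initial state (an assumption)
    frames, states = [], []
    for _ in range(n_frames):
        # Decode the joint state index into (class, transformation).
        c, t = divmod(int(state), n_shifts)
        # Draw a latent image from the class Gaussian, then transform it.
        latent = class_means[c] + noise_std * rng.standard_normal(dim)
        frames.append(np.roll(latent, shifts[t]))
        states.append((c, t))
        # The next class/transformation pair depends on the current pair.
        state = rng.choice(n_states, p=trans[state])
    return np.array(frames), states

# Example: two class images, three shifts, uniform transitions.
rng = np.random.default_rng(0)
means = np.array([[3.0, 0, 0, 0, 0, 0], [1.0, 2.0, 0, 0, 0, 0]])
uniform = np.full((6, 6), 1.0 / 6.0)
frames, states = sample_thmm(means, 0.1, [0, 1, 2], uniform, 5, rng)
```

Treating the class and the transformation as one joint state keeps the sketch short; a factorized transition model would be an alternative design.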
We use this model to cluster unlabeled video segments and
form a video summary in an unsupervised fashion. We also use the trained
models to perform tracking,
image stabilization and filtering. We demonstrate that the THMM is capable of
combining long-term dependencies in video sequences (repeating similar
frames in remote parts of the sequence) with short-term dependencies (such
as short-term image frame similarities and motion patterns) to better
summarize and process a video sequence even in the presence of high
levels of white or structured noise (such as foreground occlusion).
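
To make the tracking and stabilization use concrete, here is a hedged sketch of how a trained model of this kind could label each frame with a class and a transformation. It uses standard Viterbi decoding over joint (class, shift) states; the abstract does not say which inference procedure the paper uses, so this is a substitute chosen for illustration. As assumptions, frames are 1-D arrays, transformations are circular shifts, the noise is isotropic Gaussian, and the initial state is uniform. All names (`viterbi_thmm`, `class_means`, `shifts`, `trans`) are hypothetical.

```python
import numpy as np

def viterbi_thmm(frames, class_means, noise_std, shifts, trans):
    """Return the most likely (class, shift index) pair for every frame."""
    n_shifts = len(shifts)
    # One template per joint state: each class image under each shift.
    templates = np.array([np.roll(m, s) for m in class_means for s in shifts])
    # Isotropic Gaussian log-likelihood of each frame under each template,
    # up to an additive constant shared by all states.
    diff = frames[:, None, :] - templates[None, :, :]
    loglik = -0.5 * np.sum(diff ** 2, axis=2) / noise_std ** 2
    log_trans = np.log(trans + 1e-300)  # guard against log(0)
    score = loglik[0].copy()            # uniform initial state (an assumption)
    back = np.zeros(loglik.shape, dtype=int)
    for n in range(1, len(frames)):
        cand = score[:, None] + log_trans      # cand[i, j]: move from i to j
        back[n] = np.argmax(cand, axis=0)      # best predecessor of each j
        score = cand[back[n], np.arange(cand.shape[1])] + loglik[n]
    # Trace the best path backwards from the best final state.
    path = [int(np.argmax(score))]
    for n in range(len(frames) - 1, 0, -1):
        path.append(int(back[n][path[-1]]))
    path.reverse()
    return [divmod(s, n_shifts) for s in path]

# Example: decode a short noiseless sequence, then undo each inferred shift.
means = np.array([[3.0, 0, 0, 0, 0, 0], [1.0, 2.0, 0, 0, 0, 0]])
shifts = [0, 1, 2]
true_states = [(0, 0), (0, 1), (1, 2), (1, 0)]
frames = np.array([np.roll(means[c], shifts[t]) for c, t in true_states])
decoded = viterbi_thmm(frames, means, 1.0, shifts, np.full((6, 6), 1.0 / 6.0))
stabilized = [np.roll(f, -shifts[t]) for f, (c, t) in zip(frames, decoded)]
```

In this toy setting the decoded shift plays the role of the tracked position, and applying the inverse shift to each frame gives a stabilized sequence; the decoded class labels give a clustering of the frames.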