B. Frey and N. Jojic 2000.
Learning graphical models of images, videos and their spatial transformations.
In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, San Francisco, CA.
Mixtures of Gaussians, factor analyzers (probabilistic PCA),
and hidden Markov models are
staples of static and dynamic data modeling, and of image and video modeling
in particular. We show how topographic transformations in the input,
such as translation and shearing in images, can be accounted for in these
models by including a discrete transformation variable. The resulting models
perform clustering, dimensionality reduction and time-series analysis in
a way that is invariant to transformations in the input.
Using the EM algorithm, these transformation-invariant models
can be fit to static data and time series.
We give results on filtering microscopy
images, face and facial pose clustering,
handwritten digit modeling and recognition,
video clustering, object tracking, and removal of distractions
from video sequences.
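
As an illustration of the discrete transformation variable, the following is a minimal sketch of EM for a shift-invariant mixture of Gaussians on 1-D signals. It is not the authors' implementation. It assumes cyclic shifts (via np.roll) as the transformation set, a uniform prior over shifts, and a single isotropic noise variance, and it omits the latent-image covariance. The function name fit_shift_invariant_mog and all parameter names are hypothetical.

```python
import numpy as np

def fit_shift_invariant_mog(X, n_clusters, n_iters=30, seed=0):
    """EM for a mixture of isotropic Gaussians with a discrete cyclic-shift variable.

    Simplified model (an assumption of this sketch): x = T_l mu_c + noise,
    where T_l is a cyclic shift by l, l is uniform, and noise ~ N(0, var * I).
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Xs[l, n] is x_n shifted back by l, i.e. T_l^{-1} x_n, for every shift l.
    Xs = np.stack([np.roll(X, -l, axis=1) for l in range(D)])  # (L, N, D)
    mu = X[rng.choice(N, n_clusters, replace=False)].copy()    # (C, D)
    pi = np.full(n_clusters, 1.0 / n_clusters)
    var = X.var() + 1e-6
    for _ in range(n_iters):
        # E-step: joint posterior over (cluster c, shift l) for each x_n.
        # Shifts are permutations, so ||x_n - T_l mu_c||^2 = ||T_l^{-1} x_n - mu_c||^2.
        d2 = ((Xs[:, :, None, :] - mu[None, None, :, :]) ** 2).sum(-1)  # (L, N, C)
        logp = np.log(pi + 1e-12)[None, None, :] - 0.5 * d2 / var
        logp -= logp.max(axis=(0, 2), keepdims=True)   # numerical stability
        r = np.exp(logp)
        r /= r.sum(axis=(0, 2), keepdims=True)         # sums to 1 over (l, c)
        # M-step: average the un-shifted data under the responsibilities.
        Nc = r.sum(axis=(0, 1))                        # (C,)
        mu = np.einsum('lnc,lnd->cd', r, Xs) / (Nc[:, None] + 1e-12)
        d2 = ((Xs[:, :, None, :] - mu[None, None, :, :]) ** 2).sum(-1)
        var = (r * d2).sum() / (N * D) + 1e-6
        pi = Nc / N
    return mu, pi, var, r

if __name__ == "__main__":
    # Toy data: two prototypes observed at random cyclic shifts with small noise.
    rng = np.random.default_rng(0)
    protos = np.array([[0., 1., 3., 1., 0., 0., 0., 0.],
                       [2., 2., 2., 2., 0., 0., 0., 0.]])
    labels = rng.integers(0, 2, size=60)
    shifts = rng.integers(0, 8, size=60)
    X = np.stack([np.roll(protos[c], s) for c, s in zip(labels, shifts)])
    X += 0.05 * rng.standard_normal(X.shape)
    mu, pi, var, r = fit_shift_invariant_mog(X, n_clusters=2)
    print("learned means:\n", np.round(mu, 2))
    print("cluster assignments:", r.sum(axis=0).argmax(axis=1))
```

Summing the responsibilities r over the shift axis gives a cluster posterior that does not depend on where the pattern sits in the input. Summing over the cluster axis instead gives a posterior over the shift itself. As with any EM fit, the result depends on the initialization.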