Brendan J. Frey and Nebojsa Jojic 1999.
Estimating mixture models of images and
inferring spatial transformations using the EM algorithm.
In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition 1999,
Ft. Collins, CO. IEEE Computer Society Press: Los Alamitos, CA.
Mixture modeling and clustering algorithms are effective, simple ways to
represent images using a set of data centers. However, in situations where
the images include background clutter and transformations such as
translation, rotation, shearing and warping, these methods extract data
centers that include clutter and represent different transformations of
essentially the same data. Taking face images as an example, it would
be more useful for the different clusters to represent different
poses and expressions, rather than cluttered versions of the same face under different
translations, scales and rotations. By including clutter and transformation as unobserved,
latent variables in a mixture model, we obtain a new "transformed
mixture of Gaussians", which is invariant to a specified set of
transformations. We show how a linear-time EM algorithm can be used to fit
this model by jointly estimating a mixture model for the data and inferring
the transformation for each image. We show that this algorithm can
jointly align images of a human head and learn different poses. We also
find that the algorithm performs better than k-nearest neighbors and
mixtures of Gaussians on handwritten digit recognition.
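
The idea of treating the transformation as a latent variable alongside the cluster index can be illustrated with a minimal sketch. It is not the paper's implementation and rests on simplifying assumptions that the abstract does not state: "images" are 1-D lists, the set of transformations is the set of cyclic shifts with a uniform prior, clutter is not modeled, and the noise is isotropic with a fixed variance sigma2. The function names (shift, e_step, m_step, fit) are hypothetical. The E-step computes a joint posterior over (cluster, shift) for each signal; the M-step undoes each shift before averaging, so the learned means are aligned.

```python
import math

def shift(v, l):
    """Cyclically shift the list v to the right by l positions."""
    l %= len(v)
    return v[-l:] + v[:-l] if l else list(v)

def e_step(x, means, priors, sigma2):
    """Joint posterior P(cluster c, shift l | x), returned as post[c][l]."""
    n_shifts = len(x)  # assumption: all cyclic shifts allowed, uniform prior
    log_joint = []
    for mu, pi in zip(means, priors):
        row = []
        for l in range(n_shifts):
            sq_err = sum((a - b) ** 2 for a, b in zip(x, shift(mu, l)))
            row.append(math.log(pi + 1e-300) - math.log(n_shifts)
                       - sq_err / (2.0 * sigma2))
        log_joint.append(row)
    top = max(max(row) for row in log_joint)  # subtract max for stability
    total = sum(math.exp(v - top) for row in log_joint for v in row)
    return [[math.exp(v - top) / total for v in row] for row in log_joint]

def m_step(data, posts, n_clusters):
    """Re-estimate means and priors from posterior-weighted, un-shifted data."""
    dim = len(data[0])
    means, priors = [], []
    for c in range(n_clusters):
        acc = [0.0] * dim
        weight = 0.0
        for x, post in zip(data, posts):
            for l, r in enumerate(post[c]):
                aligned = shift(x, -l)  # undo the hypothesised shift
                for i in range(dim):
                    acc[i] += r * aligned[i]
                weight += r
        means.append([a / max(weight, 1e-12) for a in acc])
        priors.append(weight / len(data))
    return means, priors

def fit(data, means, priors, sigma2=0.1, n_iter=10):
    """Alternate E- and M-steps for a fixed number of iterations."""
    for _ in range(n_iter):
        posts = [e_step(x, means, priors, sigma2) for x in data]
        means, priors = m_step(data, posts, len(means))
    return means, priors
```

In this sketch the E-step for one signal visits every (cluster, shift) pair once, so its cost grows linearly with the number of shifts considered. Because cyclic shifts are permutations, comparing x with a shifted mean is equivalent to comparing the un-shifted x with the mean, which is why the M-step can average aligned copies of the data.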