Brendan J. Frey and Nebojsa Jojic 2000. Transformation-invariant clustering and dimensionality reduction. Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, Nov. 2000.


Clustering and dimensionality reduction are simple, effective ways to derive useful representations of data, such as images. These procedures are often used as preprocessing steps for more sophisticated pattern analysis techniques. (In fact, they often perform as well as or better than those techniques.) However, when each input has been randomly transformed (e.g., by translation, rotation and shearing in images), these methods tend to extract cluster centers and submanifolds that account for variations in the input due to the transformations, instead of more interesting and potentially useful structure. For example, if images of a human face are clustered, it would be more useful for the different clusters to represent different poses and expressions than different translations and rotations. We describe a way to add transformation invariance to mixture models, factor analyzers and mixtures of factor analyzers by approximating the nonlinear transformation manifold by a discrete set of points. In contrast to linear approximations of the transformation manifold, which assume the amount of transformation is small, our method works well for large levels of transformation. We show how the expectation maximization algorithm can be used to jointly learn a set of clusters, a subspace model, or a mixture of subspace models and, at the same time, infer the transformation associated with each case. After illustrating this technique on some difficult contrived problems, we compare the technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images of faces into different categories of identification and pose, subspace modeling of facial expressions, subspace modeling of images of handwritten digits for handwriting classification, and unsupervised classification of images of handwritten digits.
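
To make the idea of a discrete set of transformations concrete, here is a minimal sketch of transformation-invariant clustering with EM. It is a simplification under stated assumptions, not the model from the paper. The inputs are 1-D signals, the only transformations are circular shifts, the noise is isotropic Gaussian with a fixed variance, and the prior over shifts is uniform. The function name tic_em and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np

def tic_em(X, n_clusters, n_iters=30, sigma2=0.1, seed=0):
    """EM for a Gaussian mixture with a discrete latent circular shift.

    Assumed model (a simplification): x = roll(mu_c, l) + noise, with
    noise ~ N(0, sigma2 * I), sigma2 fixed, and a uniform prior on l.
    Returns the cluster means and the responsibilities r[n, c, l].
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    eps = 1e-12  # guards against empty clusters

    # Undo every candidate shift once: X_un[n, l] is x_n rolled back by l.
    X_un = np.stack([np.roll(X, -l, axis=1) for l in range(D)], axis=1)

    # Initialise the means from randomly chosen data points.
    mu = X[rng.choice(N, n_clusters, replace=False)].copy()
    log_pi = np.full(n_clusters, -np.log(n_clusters))

    for _ in range(n_iters):
        # E-step: posterior over (cluster, shift) pairs for each input.
        # A shift is a permutation, so ||x - roll(mu, l)|| equals
        # ||roll(x, -l) - mu||, which lets us compare in the mean's frame.
        diff = X_un[:, None, :, :] - mu[None, :, None, :]  # (N, C, L, D)
        log_r = log_pi[None, :, None] - 0.5 * (diff ** 2).sum(-1) / sigma2
        log_r -= log_r.max(axis=(1, 2), keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=(1, 2), keepdims=True)

        # M-step: each mean is the responsibility-weighted average of the
        # inputs after their shifts have been undone.
        Nc = r.sum(axis=(0, 2))  # effective number of inputs per cluster
        mu = np.einsum('ncl,nld->cd', r, X_un) / (Nc[:, None] + eps)
        log_pi = np.log(Nc + eps) - np.log(N + n_clusters * eps)

    return mu, r
```

Calling tic_em(X, n_clusters=2) on an array X of shape (N, D) returns the means with shape (2, D) and responsibilities with shape (N, 2, D); taking the argmax of r[n] over the cluster and shift axes gives the inferred cluster and shift for input n. Enumerating every shift keeps the E-step exact for this toy transformation set, at a cost that grows linearly with the number of candidate transformations.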

