Flobject Analysis



Related Publications

Learning Better Image Representations Using 'Flobject Analysis' [pdf] [Supplementary Materials]
P.S. Li*, I.E. Givoni*, and B.J. Frey (* joint first authors)
To be presented at CVPR 2011.


What is Flobject Analysis?

Unsupervised learning can be used to extract image representations that are useful for various and diverse vision tasks. After noticing that most biological vision systems for interpreting static images are trained using disparity information, we developed an analogous framework for unsupervised learning. The output of our method is a model that can generate a vector representation or descriptor from any static image. However, the model is trained using pairs of consecutive video frames, which are used to find representations that are consistent with optical flow-derived objects, or 'flobjects'. To demonstrate the flobject analysis framework, we extend the latent Dirichlet allocation bag-of-words model to account for real-valued word-specific flow vectors and image-specific probabilistic associations between flow clusters and topics. We show that the static image representations extracted using our method can be used to achieve higher classification rates and better generalization than standard topic models, spatial pyramid matching and gist descriptors.
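As a rough illustration of the descriptor-extraction idea (not the flow-augmented model itself), the sketch below fits a plain LDA topic model to bag-of-visual-words counts and uses the inferred topic proportions as fixed-length descriptors for a linear classifier. The data, dimensions, and use of scikit-learn are assumptions made for illustration; the actual flobject model additionally conditions on real-valued per-word flow vectors during training, which standard LDA implementations do not support.

```python
# Hypothetical sketch: topic-proportion descriptors from bag-of-visual-words
# counts with plain LDA, then a linear classifier on those descriptors.
# The flow-augmented extension described above (word-specific flow vectors,
# image-specific flow-cluster/topic associations) is NOT implemented here.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Assumed inputs: each row holds visual-word counts for one training image
# (e.g., quantized local descriptors); labels mark positive/negative classes.
# Random data stands in for real features in this sketch.
n_images, vocab_size, n_topics = 200, 500, 20
word_counts = rng.poisson(1.0, size=(n_images, vocab_size))
labels = rng.integers(0, 2, size=n_images)

# Unsupervised step: fit the topic model on unlabeled word counts.
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
lda.fit(word_counts)

# Each static image is then represented by its inferred topic proportions,
# giving a fixed-length descriptor that can feed any standard classifier.
descriptors = lda.transform(word_counts)

clf = LinearSVC().fit(descriptors, labels)
print("training accuracy:", clf.score(descriptors, labels))
```

At test time, only static images and their word counts are needed; the flow information is used solely while training the model.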


Data Sets

CityCars Dataset

The dataset includes 315 image pairs shot in an urban scene containing moving cars (positive examples) and 338 images shot in the same environment but without cars (negative examples). We ensured that many of the negative examples were recorded at the same locations as the positive examples. Images are in grayscale .png format and are each 216x384 pixels. A dense optical flow field, computed with the "Lucas/Kanade meets Horn/Schunck" method, is provided for each image pair as a MATLAB (.mat) file.
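
A minimal loading sketch for one image pair and its precomputed flow is shown below. The file names and the variable names inside the .mat file ('u', 'v') are assumptions; check the layout of the downloaded archive before use.

```python
# Minimal loading sketch for one CityCars image pair and its flow file.
# File names and .mat variable names ('u', 'v') are assumptions.
import numpy as np
from PIL import Image
from scipy.io import loadmat

frame1 = np.array(Image.open("citycars/pos/pair_001_frame1.png"))  # grayscale, 216x384
frame2 = np.array(Image.open("citycars/pos/pair_001_frame2.png"))

flow = loadmat("citycars/pos/pair_001_flow.mat")
u, v = flow["u"], flow["v"]        # horizontal / vertical flow components
magnitude = np.hypot(u, v)         # per-pixel flow magnitude

print(frame1.shape, u.shape)       # both should be (216, 384)
```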
CityPedestrians Dataset

The dataset includes 938 image pairs of side views of pedestrians walking in an urban environment, as well as 456 static images without pedestrians for use as negative examples. We ensured that many of the negative examples were recorded at the same locations as the positive examples. Images are in grayscale .png format and are each 216x384 pixels. A dense optical flow field, computed with the "Lucas/Kanade meets Horn/Schunck" method, is provided for each image pair as a MATLAB (.mat) file.
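
If you want to compute flow for your own frame pairs rather than use the precomputed files, note that the "Lucas/Kanade meets Horn/Schunck" implementation used for these datasets is MATLAB code. The sketch below uses OpenCV's Farneback flow purely as a readily available stand-in; it is a different algorithm and will not reproduce the provided flow fields.

```python
# Rough stand-in for computing dense flow on your own frame pairs.
# This uses OpenCV's Farneback method, not the "Lucas/Kanade meets
# Horn/Schunck" method used to generate the flow files above.
import cv2

frame1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
frame2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Positional arguments: prev, next, flow, pyr_scale, levels, winsize,
# iterations, poly_n, poly_sigma, flags.
flow = cv2.calcOpticalFlowFarneback(frame1, frame2, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

u, v = flow[..., 0], flow[..., 1]  # horizontal / vertical components
```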