The paper describes objects with a representation called shape context, which is similar to the edgelet. The contour of the object is densely sampled, and each sample gets a descriptor built from the distribution of distances and angles to the other samples. (One simple question: this relative measurement should be naturally invariant to rotation in the image plane, yet the authors still used an extra appendix to show this property.) The distance between two samples is then defined with the chi-square test statistic on their descriptors.
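To make the descriptor concrete, here is a minimal sketch of a log-polar histogram for one contour sample, plus the chi-square cost between two such histograms. The function names, the bin counts (5 radial, 12 angular) and the radius range are my own assumptions for illustration, not values taken from the paper's code. Note that in this sketch the angles are measured against the image x-axis, so the histogram is invariant to translation and (through the mean-distance normalization) to scale, but not to rotation; getting rotation invariance would need a local reference direction such as the contour tangent.

```python
import math

def shape_context(points, index, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram of all other points as seen from points[index].

    Bin counts and the radius range are illustrative assumptions.
    """
    n = len(points)
    # Mean pairwise distance, used to normalize radii (scale invariance).
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.hypot(points[i][0] - points[j][0],
                                points[i][1] - points[j][1])
    mean_dist = total / (n * (n - 1) / 2)

    log_min, log_max = math.log(r_min), math.log(r_max)
    hist = [0.0] * (n_r * n_theta)
    px, py = points[index]
    for j, (qx, qy) in enumerate(points):
        if j == index:
            continue
        dx, dy = qx - px, qy - py
        # Normalized radius, clamped into the histogram's radial range.
        r = min(max(math.hypot(dx, dy) / mean_dist, r_min), r_max)
        r_bin = min(int((math.log(r) - log_min) / (log_max - log_min) * n_r),
                    n_r - 1)
        # Angle against the image x-axis, mapped to [0, 2*pi).
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
        hist[r_bin * n_theta + t_bin] += 1.0
    s = sum(hist)
    return [h / s for h in hist] if s else hist

def chi2_cost(g, h):
    """Chi-square test statistic between two normalized histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(g, h) if a + b > 0)

if __name__ == "__main__":
    pts = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 5)]
    h_a = shape_context(pts, 0)
    h_b = shape_context(pts, 4)
    print(chi2_cost(h_a, h_a), chi2_cost(h_a, h_b))
```

The cost is 0 for identical histograms and 1 for normalized histograms with no overlapping bins, which makes it convenient as an entry in a pairwise cost matrix.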
Given this discriminative representation, point correspondences between two images can be found, and the image-domain warp that best fits those correspondences is estimated (a thin plate spline is used here, and I think there are better choices now). The similarity between two images is then defined on top of this. However, equation 12 in the paper looks somewhat erroneous: the indices of the min operators seem to be accumulated over as well. The warping distance is also not easy to understand, and it may produce a large distance even when the object is only uniformly scaled.
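The correspondence step can be read as a one-to-one assignment problem: given a matrix of pairwise descriptor costs, pick the pairing with the smallest total cost. The sketch below brute-forces all permutations, which is only workable for a handful of points and is a stand-in for a proper bipartite matching solver; the function name `best_assignment` and the toy cost matrix are hypothetical.

```python
from itertools import permutations

def best_assignment(cost):
    """Return (perm, total) minimizing sum of cost[i][perm[i]].

    Brute force over all permutations: O(n!) and for illustration only.
    A real system would use a polynomial-time bipartite matching solver.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total

if __name__ == "__main__":
    # Toy cost matrix: cost[i][j] is the cost of matching sample i
    # in the first shape to sample j in the second shape.
    cost = [[4, 1, 3],
            [2, 0, 5],
            [3, 2, 2]]
    print(best_assignment(cost))  # ([1, 0, 2], 5)
```

Note that the cheapest single entry (row 1, column 1, cost 0) is not part of the optimal pairing here, which is why a greedy nearest-descriptor match is not enough.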
The classification system is based on K-NN, with K-medoids clustering used to choose the prototypes. It gave the best classification rate reported as of 2001. Overall the paper is well written and persuasive. Despite the small problems, it opens a new avenue for object classification.
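As a rough illustration of the classification step, here is a minimal K-NN sketch that votes among the nearest stored prototypes under an arbitrary distance function. The name `knn_classify`, the default `k=3` and the 1-D toy data are my own assumptions; in the paper's setting the distance would be the shape distance between two images, and the prototypes would be the ones picked by K-medoids.

```python
from collections import Counter

def knn_classify(query, prototypes, distance, k=3):
    """Majority vote among the k prototypes closest to the query.

    prototypes: list of (item, label) pairs.
    distance:   any callable distance(query, item) -> float.
    """
    ranked = sorted(prototypes, key=lambda p: distance(query, p[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy 1-D "shapes" with absolute difference standing in for shape distance.
    protos = [(0.0, "a"), (0.1, "a"), (0.9, "b"), (1.0, "b"), (1.1, "b")]
    dist = lambda x, y: abs(x - y)
    print(knn_classify(0.05, protos, dist))  # a
    print(knn_classify(1.0, protos, dist))   # b
```

Because only a distance function is needed, the classifier never has to embed the shapes in a vector space, which fits a distance that is defined through matching and warping.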