
1.1.3 Labeling Problems in Vision

In terms of the regularity of the sites and the continuity of the labels, we may classify a vision labeling problem into one of the following four categories:

LP1: Regular sites with continuous labels.
LP2: Regular sites with discrete labels.
LP3: Irregular sites with discrete labels.
LP4: Irregular sites with continuous labels.

The first two categories characterize low-level processing performed on observed images, and the other two characterize high-level processing on extracted token features. The following describes some vision problems in terms of the four categories.
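As a small illustration, the four categories can be viewed as the four combinations of two yes/no properties. The sketch below encodes that reading; the function name `labeling_category` and its boolean parameters are hypothetical and are not part of the original text.

```python
def labeling_category(regular_sites: bool, continuous_labels: bool) -> str:
    """Return the category name (LP1..LP4) for a labeling problem.

    Hypothetical helper: the two flags mirror the two criteria used in
    the classification (regularity of sites, continuity of labels).
    """
    if regular_sites:
        # Regular sites (e.g. a pixel lattice): low-level processing.
        return "LP1" if continuous_labels else "LP2"
    # Irregular sites (e.g. extracted features): high-level processing.
    return "LP4" if continuous_labels else "LP3"


# Restoration of a grey-level image: regular sites, continuous labels.
restoration = labeling_category(regular_sites=True, continuous_labels=True)
# Feature-based object matching: irregular sites, discrete labels.
matching = labeling_category(regular_sites=False, continuous_labels=False)
```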

Restoration or smoothing of images having continuous pixel values is an LP1. The set of sites corresponds to image pixels and the set of labels is a real interval. The aim of restoration is to estimate the true image signal from a degraded or noise-corrupted image.

Restoration of binary or multi-level images is an LP2. As in continuous restoration, the aim is to estimate the true image signal from the input image. The difference is that each pixel in the resulting image assumes a discrete value, and thus the label set in this case is a set of discrete labels.

Region segmentation is an LP2. It partitions an observation image into mutually exclusive regions, each of which has some uniform and homogeneous properties whose values are significantly different from those of the neighboring regions. The property can be, for example, grey tone, color or texture. Pixels within each region are assigned a unique label.

The prior assumption in the above problems is that the signal is smooth or piecewise smooth. This is complementary to the assumption of abrupt changes made for edge detection.

Edge detection is also an LP2. Each edge site, located between two neighboring pixels, is assigned a label in {edge, non-edge}; the label edge is assigned if there is a significant difference between the two pixels. Continuous restoration with discontinuities can be viewed as a combination of LP1 and LP2.
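To make the notion of an edge site concrete, the following minimal sketch labels the sites between horizontally adjacent pixels. It assumes that "significant difference" means an absolute grey-level difference above a fixed threshold; that rule, the function name and the sample image are illustrative assumptions, not the method of the text, which goes on to treat labeling as an optimization problem.

```python
def label_horizontal_edge_sites(image, threshold):
    """Label each site lying between two horizontally adjacent pixels.

    image: list of rows of grey levels (a regular lattice of pixels).
    threshold: assumed cutoff for a "significant" difference.
    Returns one row of labels per image row, each one element shorter
    than the pixel row, since sites sit between neighboring pixels.
    """
    labels = []
    for row in image:
        labels.append([
            "edge" if abs(row[j + 1] - row[j]) > threshold else "non-edge"
            for j in range(len(row) - 1)
        ])
    return labels


# Hypothetical 2x4 image with a vertical step between columns 1 and 2.
image = [[10, 12, 80, 82],
         [11, 13, 79, 81]]
edge_labels = label_horizontal_edge_sites(image, threshold=30)
```

Here the sites form a regular arrangement and the labels are discrete, which is what places the problem in LP2.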

Perceptual grouping is an LP3. The sites usually correspond to initially segmented features (points, lines and regions) which are irregularly arranged. The fragmentary features are to be organized into perceptually more significant features. Each pair of features is assigned a label in {connected, disconnected}, indicating whether the two features should be linked.

Feature-based object matching and recognition is an LP3. Each site indexes an image feature such as a point, a line segment or a region. Labels are discrete in nature and each of them indexes a model feature. The resulting configuration is a mapping from the image features to those of a model object.

Pose estimation from a set of point correspondences might be formulated as an LP4. A site is a given correspondence. A label represents an admissible (orthogonal, affine or perspective) transformation. A prior (unary) constraint is that the transformation represented by a label must itself be orthogonal, affine or perspective. A mutual constraint is that the labels should be close to each other to form a consistent transformation.

For a discrete labeling problem of m sites and M labels, there exist a total of $M^m$ possible labelings. For a continuous labeling problem, there are infinitely many. However, among all the possibilities, only a few are optimal in terms of a criterion measuring the goodness (or inversely, the cost) of solutions. This is the optimization approach to visual labeling.
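The size of the discrete solution space, and the idea of picking the labeling that is best under a cost criterion, can be sketched on a toy problem. With m sites each taking one of M labels independently, the count is M to the power m. The cost used below (squared deviation from the observation plus a squared-difference smoothness penalty between neighboring sites) is an assumed example criterion, and the names `count_labelings`, `best_labeling` and `smooth_weight` are hypothetical. Exhaustive enumeration is only feasible for very small m.

```python
from itertools import product


def count_labelings(m, M):
    """Number of ways to assign one of M labels to each of m sites."""
    return M ** m


def best_labeling(observed, labels, smooth_weight=1.0):
    """Exhaustively search all labelings of a 1D chain of sites.

    Assumed cost: data term (closeness to the observation) plus a
    smoothness term between neighboring sites on the chain.
    """
    def cost(f):
        data = sum((fi - di) ** 2 for fi, di in zip(f, observed))
        smooth = sum((f[i] - f[i + 1]) ** 2 for i in range(len(f) - 1))
        return data + smooth_weight * smooth

    # product(...) enumerates every one of the M**m labelings.
    return min(product(labels, repeat=len(observed)), key=cost)


# Binary restoration of a 3-site signal with one isolated "1".
n_labelings = count_labelings(m=3, M=2)
restored = best_labeling(observed=[0, 1, 0], labels=[0, 1])
```

For this input, all 8 labelings are examined and the all-zero labeling has the lowest assumed cost, so the isolated value is smoothed away. For an image of realistic size the count grows far too quickly for enumeration, which is why the choice of optimization method matters.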


