1.5.1 Bayes Estimation

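In Bayes estimation, a risk is minimized to obtain the optimal estimate. The Bayes risk of an estimate $\mathbf{f}^*$ is defined as

$$R(\mathbf{f}^*) = \int_{\mathbf{f} \in \mathbb{F}} C(\mathbf{f}^*, \mathbf{f})\, P(\mathbf{f} \mid \mathbf{d})\, d\mathbf{f} \tag{1.69}$$

where **d** is the observation, $C(\mathbf{f}^*, \mathbf{f})$ is a cost function, and $P(\mathbf{f} \mid \mathbf{d})$ is the posterior distribution. We first need to compute
the posterior distribution from the prior and the likelihood. According
to the Bayes rule, the posterior probability is given by

$$P(\mathbf{f} \mid \mathbf{d}) = \frac{p(\mathbf{d} \mid \mathbf{f})\, P(\mathbf{f})}{p(\mathbf{d})} \tag{1.70}$$

where $P(\mathbf{f})$ is the prior probability of labelings **f**,
$p(\mathbf{d} \mid \mathbf{f})$ is the conditional p.d.f. of the observations **d**, also called the
likelihood function of **f** for **d** fixed,
and $p(\mathbf{d})$ is the density of **d**, which is a constant when **d** is given.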
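
The Bayes rule can be illustrated on a small discrete label set, where the density of **d** reduces to a sum over the candidate labelings. The following is a minimal sketch under that assumption; the function name `posterior`, the two labels, and all numbers are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood):
    """Return P(f | d) for each candidate labeling f via the Bayes rule.

    prior[f]      -- P(f), prior probability of labeling f
    likelihood[f] -- p(d | f), likelihood of f for the fixed observation d
    """
    # Joint p(d | f) P(f) for every candidate labeling f
    joint = {f: likelihood[f] * prior[f] for f in prior}
    # p(d): sum of the joint over all f (discrete case); constant once d is fixed
    evidence = sum(joint.values())
    return {f: joint[f] / evidence for f in joint}


# Hypothetical two-label example (values are illustrative assumptions)
prior = {"edge": 0.2, "no_edge": 0.8}
likelihood = {"edge": 0.9, "no_edge": 0.3}
post = posterior(prior, likelihood)
print(post)
```

Dividing by the evidence only rescales the joint, so the ranking of the labelings is the same before and after normalization.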

**Figure 1.6:** Two choices of cost functions.

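The cost function $C(\mathbf{f}^*, \mathbf{f})$ determines the cost of the estimate $\mathbf{f}^*$ when
the truth is **f**. It is defined according to our preference. Two
popular choices are the quadratic cost function

$$C(\mathbf{f}^*, \mathbf{f}) = \|\mathbf{f}^* - \mathbf{f}\|^2 \tag{1.71}$$

where $\|\mathbf{a} - \mathbf{b}\|$ is a distance between **a** and **b**, and the
$\delta$ (0-1) cost function

$$C(\mathbf{f}^*, \mathbf{f}) = \begin{cases} 0 & \text{if } \|\mathbf{f}^* - \mathbf{f}\| \le \delta \\ 1 & \text{otherwise} \end{cases} \tag{1.72}$$

where $\delta > 0$ is any small constant. The two cost functions are plotted in Fig. 1.6.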
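
The two cost functions can be written down directly. The sketch below assumes the distance is Euclidean, which is only one possible choice since the text leaves the distance open; the function names and the default value of `delta` are illustrative assumptions.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length sequences (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quadratic_cost(f_star, f):
    """Quadratic cost: squared distance between estimate f_star and truth f."""
    return distance(f_star, f) ** 2


def delta_cost(f_star, f, delta=1e-3):
    """0-1 cost: zero if f_star lies within delta of f, one otherwise."""
    return 0.0 if distance(f_star, f) <= delta else 1.0


# The quadratic cost grows with the error; the 0-1 cost treats all misses alike.
print(quadratic_cost([0.0, 0.0], [3.0, 4.0]))
print(delta_cost([1.0], [1.0]), delta_cost([1.0], [2.0]))
```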

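The Bayes risk under the quadratic cost function measures the variance of the estimate:

$$R(\mathbf{f}^*) = \int_{\mathbf{f} \in \mathbb{F}} \|\mathbf{f}^* - \mathbf{f}\|^2\, P(\mathbf{f} \mid \mathbf{d})\, d\mathbf{f} \tag{1.73}$$

Letting $\partial R(\mathbf{f}^*) / \partial \mathbf{f}^* = 0$, we obtain the minimal variance estimate

$$\mathbf{f}^* = \int_{\mathbf{f} \in \mathbb{F}} \mathbf{f}\, P(\mathbf{f} \mid \mathbf{d})\, d\mathbf{f} \tag{1.74}$$

which is the mean of the posterior distribution.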
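
That the posterior mean minimizes the quadratic risk can be checked numerically. The sketch below replaces the integral with a sum over a scalar, discrete posterior and searches a grid of candidate estimates; the posterior values and the grid spacing are assumptions made for illustration.

```python
def quadratic_risk(f_star, values, probs):
    """Discrete analogue of the quadratic Bayes risk: sum of (f* - f)^2 P(f | d)."""
    return sum((f_star - f) ** 2 * p for f, p in zip(values, probs))


# Hypothetical scalar posterior P(f | d) on four values (probabilities sum to 1)
values = [0.0, 1.0, 2.0, 3.0]
probs = [0.1, 0.2, 0.3, 0.4]

# Posterior mean: the closed-form minimizer of the quadratic risk
posterior_mean = sum(f * p for f, p in zip(values, probs))

# Brute-force search over candidate estimates 0.00, 0.01, ..., 3.00
candidates = [i / 100 for i in range(301)]
best = min(candidates, key=lambda c: quadratic_risk(c, values, probs))

print(posterior_mean, best, quadratic_risk(best, values, probs))
```

The grid minimizer coincides with the posterior mean, and the risk at that point equals the posterior variance.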

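For the $\delta$ cost function, the Bayes risk is

$$R(\mathbf{f}^*) = \int_{\mathbf{f}:\, \|\mathbf{f}^* - \mathbf{f}\| > \delta} P(\mathbf{f} \mid \mathbf{d})\, d\mathbf{f} = 1 - \int_{\mathbf{f}:\, \|\mathbf{f}^* - \mathbf{f}\| \le \delta} P(\mathbf{f} \mid \mathbf{d})\, d\mathbf{f} \tag{1.75}$$

When $\delta \to 0$, the above is approximated by

$$R(\mathbf{f}^*) \approx 1 - \kappa\, P(\mathbf{f}^* \mid \mathbf{d}) \tag{1.76}$$

where $\kappa$ is the volume of the space containing all points **f** for
which $\|\mathbf{f}^* - \mathbf{f}\| \le \delta$. Minimizing the above is equivalent to
maximizing the posterior probability. Therefore, the minimal risk
estimate is

$$\mathbf{f}^* = \arg\max_{\mathbf{f} \in \mathbb{F}} P(\mathbf{f} \mid \mathbf{d}) \tag{1.77}$$
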
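which is known as the MAP estimate. Because $p(\mathbf{d})$ in
(1.70) is a constant for a fixed **d**, $P(\mathbf{f} \mid \mathbf{d})$ is
proportional to the joint distribution

$$P(\mathbf{f} \mid \mathbf{d}) \propto P(\mathbf{f}, \mathbf{d}) = p(\mathbf{d} \mid \mathbf{f})\, P(\mathbf{f}) \tag{1.78}$$

Then the MAP estimate is equivalently found by

$$\mathbf{f}^* = \arg\max_{\mathbf{f} \in \mathbb{F}} \left\{ p(\mathbf{d} \mid \mathbf{f})\, P(\mathbf{f}) \right\} \tag{1.79}$$

When the prior distribution $P(\mathbf{f})$ is flat, the MAP estimate is equivalent to the maximum likelihood estimate.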
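
The relation between the MAP and maximum likelihood estimates can be seen on a small discrete example. The sketch below is an illustration under assumed values; the labels `a`, `b`, `c`, the function names, and all probabilities are hypothetical.

```python
def map_estimate(prior, likelihood):
    """Labeling maximizing p(d | f) P(f); p(d) is omitted since it does not depend on f."""
    return max(prior, key=lambda f: likelihood[f] * prior[f])


def ml_estimate(likelihood):
    """Labeling maximizing the likelihood p(d | f) alone."""
    return max(likelihood, key=likelihood.get)


# Hypothetical likelihoods p(d | f) for a fixed observation d
likelihood = {"a": 0.5, "b": 0.3, "c": 0.2}

# An informative prior can pull the MAP estimate away from the ML estimate
informative_prior = {"a": 0.1, "b": 0.6, "c": 0.3}
# A flat prior scales every candidate equally, so MAP and ML coincide
flat_prior = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}

print(ml_estimate(likelihood))
print(map_estimate(informative_prior, likelihood))
print(map_estimate(flat_prior, likelihood))
```

With the informative prior the two estimates differ, while under the flat prior the product of likelihood and prior is just a rescaled likelihood and both estimates select the same labeling.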