This prior seems to favour values of theta close to 0 or 1, so how can it be uninformative?
The intuition they provide is the following: think of the Maximum Likelihood estimator as the most uninformative estimate of the parameters. We can then rank priors by informativeness according to how close the Bayesian Posterior Mean estimate comes to the maximum likelihood estimate.
If we observe k heads out of n tosses, we get the following estimates of theta:
1. MLE: k/n
2. PM using Jeffreys prior, Beta(1/2,1/2): (k+1/2)/(n+1)
3. Laplace smoothing (PM using Uniform prior, Beta(1,1)): (k+1)/(n+2)
4. PM using a general Beta(a,b) prior: (k+a)/(n+a+b)
You can see that, by this informal criterion for informativeness, Jeffreys prior is less informative than the Uniform prior, and Beta(0,0) (the limiting distribution) is the least informative of all, because then the Posterior Mean estimate coincides with the MLE.
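The ranking above can be checked numerically. The following is a minimal sketch, not taken from the cited papers: the function names `posterior_mean` and `compare_estimates`, and the sample values k=3, n=10, are illustrative assumptions. It computes the distance between the Posterior Mean and the MLE for each prior using exact fractions.

```python
from fractions import Fraction


def posterior_mean(k, n, a, b):
    """Posterior mean of theta under a Beta(a, b) prior after k heads in n tosses."""
    return (k + a) / (n + a + b)


def compare_estimates(k, n):
    """Return |PM - MLE| for each prior discussed above (hypothetical helper)."""
    mle = Fraction(k, n)
    priors = {
        "Beta(0,0) limit": (Fraction(0), Fraction(0)),
        "Jeffreys Beta(1/2,1/2)": (Fraction(1, 2), Fraction(1, 2)),
        "Uniform Beta(1,1)": (Fraction(1), Fraction(1)),
    }
    return {
        name: abs(posterior_mean(k, n, a, b) - mle)
        for name, (a, b) in priors.items()
    }


# Illustrative data (an assumption, not from the source): 3 heads in 10 tosses.
gaps = compare_estimates(k=3, n=10)
for name, gap in gaps.items():
    print(f"{name}: |PM - MLE| = {gap}")
```

For a symmetric Beta(a,a) prior the gap works out to a|n-2k| / (n(n+2a)), so it shrinks as a decreases and vanishes when k = n/2, where every symmetric prior agrees with the MLE. With the sample values here the gap is smaller under Jeffreys prior than under the Uniform prior, and zero in the Beta(0,0) limit.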
- Zhu, Lu, The Counter-intuitive Non-informative Prior for the Bernoulli Family, Journal of Statistics Education Volume 12, Number 2 (2004), http://www.amstat.org/publications/jse/v12n2/zhu.pdf
- Kass, Wasserman, "The Selection of Prior Distributions by Formal Rules", JASA 96
- Anne Randi Syversveen. "Noninformative Bayesian priors. Interpretation and problems with construction and applications" (1998), unpublished http://www.math.ntnu.no/preprint/statistics/1998/S3-1998.ps