The Total Belief Theorem



A generalization of the total probability theorem




Decision making and estimation are common problems in most applied sciences, as people or machines need to make inferences about the state of the external world, and take appropriate actions.
Normally the state of the world is assumed to be described by a probability distribution over a set of alternative hypotheses; decision and estimation then require estimating this distribution from the data. Sometimes, however, as in the case of extremely rare events (e.g., a volcanic eruption), few statistics are available to drive the estimation, and observed data can be missing. Furthermore, under the law of large numbers, probability distributions are the outcome of an infinite process of evidence accumulation, whereas in all practical cases the available evidence only provides some sort of constraint on the unknown, “true” probability governing the process.

Different kinds of constraints are associated with different generalizations of probabilities, formulated to model uncertainty at the level of probability distributions. The simplest such objects are, possibly, interval probabilities [1] and convex sets of probabilities or “credal sets” [2]. A battery of different uncertainty theories [3], however, has been developed in the last century or so, starting from De Finetti's pioneering work [4]. Shafer's theory of belief functions [5] (BFs, based on A. Dempster's seminal work [6]) allows us to express partial belief by providing lower and upper bounds on the probability of each event. Uncertainty is often treated by scientists in the Bayesian framework, typically by imposing additional assumptions in order to pick a specific a-priori distribution. The pervasiveness of uncertainty, though, explains why belief functions have been increasingly applied to fields as diverse as robotics, economics and machine vision, and why powerful tools for decision making with BFs have been proposed [7] [8] [9].
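
To make these lower and upper bounds concrete, the sketch below (an illustration of mine, not code from the cited references) computes the belief and plausibility of an event from a basic probability assignment (BPA), here represented simply as a dictionary mapping focal elements to masses: the belief of A sums the masses of the focal elements contained in A, its plausibility those of the focal elements intersecting A.

```python
# Illustrative sketch: belief (lower bound) and plausibility (upper bound)
# of an event A under a basic probability assignment m.
def belief(A, m):
    """bel(A): total mass of the focal elements contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def plausibility(A, m):
    """pl(A): total mass of the focal elements intersecting A."""
    return sum(v for B, v in m.items() if B & A)

# Example on the frame {a, b, c}: 0.6 committed to {a}, 0.4 left to the
# whole frame; then 0.6 <= P({a, b}) <= 1.0.
m = {frozenset({'a'}): 0.6, frozenset({'a', 'b', 'c'}): 0.4}
A = frozenset({'a', 'b'})
print(belief(A, m), plausibility(A, m))   # 0.6 1.0
```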

When observations come from time series, however, or when conditional independence assumptions need to be applied to simplify the structure of a joint distribution, the need to generalize the classical results on total probability arises. “Graphical models” are widely employed, for instance, in object recognition and image segmentation [10], where conditional independence is crucial to make optimization tractable. In target tracking, conditional constraints on the targets' future positions given their past locations are available. If such constraints are described by belief functions, predicting the current target locations requires combining the conditional BFs into a total belief function [11,12]. Different definitions of conditional belief functions have been proposed in the past by Jaffray [13], Denneberg [14], Smets [15], Kyburg [16], and others. As explored by the proposer, conditional belief functions can also be defined by minimizing appropriate geometric distances [17]. The most common approach, though, is to use conditional BFs induced by Dempster's original evidence combination rule, a generalization of Bayes' rule to belief functions. Whatever the chosen definition of conditional BF, the total probability problem for belief functions reads as follows.
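
As a reference point for what follows, here is a minimal sketch of Dempster's rule of combination and of the conditioning it induces, obtained by combining a BPA with a categorical BPA assigning all mass to the conditioning event; the dictionary representation of BPAs is my own choice, as in the sketch above.

```python
from collections import defaultdict

def dempster_combine(m1, m2):
    """Dempster's rule: intersect focal elements, multiply masses,
    discard the conflicting (empty) intersections and renormalize."""
    raw, conflict = defaultdict(float), 0.0
    for B, v1 in m1.items():
        for C, v2 in m2.items():
            inter = B & C
            if inter:
                raw[inter] += v1 * v2
            else:
                conflict += v1 * v2
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence")
    return {A: v / (1.0 - conflict) for A, v in raw.items()}

def dempster_condition(m, E):
    """Dempster's conditioning of m on the event E."""
    return dempster_combine(m, {frozenset(E): 1.0})

# Conditioning the BPA below on {a, b} yields masses 0.5, 0.3 and 0.2
# on {a}, {b} and {a, b} respectively.
m = {frozenset({'a'}): 0.5, frozenset({'b', 'c'}): 0.3,
     frozenset({'a', 'b', 'c'}): 0.2}
print(dempster_condition(m, {'a', 'b'}))
```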

Consider a set S and a disjoint partition P of S. Suppose a belief function is defined on each element of the partition, and an (a-priori) BF is given on P itself. We seek a total belief function on the whole of S whose restriction to P coincides with the a-priori BF, and whose conditional versions (under Dempster's conditioning, for instance, but not necessarily) coincide with the given belief functions on each element of the partition P.
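
In symbols, and under a notation of my own choosing (Θ for the set S, Π = {Θ_1, ..., Θ_n} for the partition P, b_0 for the a-priori belief function on Π, and b_i for the belief function given on the element Θ_i), the problem can be stated as:

```latex
% A possible formalization of the total belief problem (notation mine).
\[
  \text{Find a belief function } b : 2^{\Theta} \rightarrow [0,1]
  \text{ such that }\;
  b\big|_{\Pi} = b_0
  \quad\text{and}\quad
  b(\cdot \mid \Theta_i) = b_i \;\; \text{for all } i = 1, \dots, n,
\]
% where b|_\Pi denotes the restriction of b to the partition \Pi, and
% b( . | \Theta_i) the chosen notion of conditional belief function
% (e.g. the one induced by Dempster's rule).
```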


Preliminary studies by the proposer suggest that, unlike in the classical case, this total belief theorem has several admissible solutions, which form the nodes of a graph endowed with a number of interesting symmetries. Links with the theory of positive linear systems and transversal matroids [18] also emerge. In the more general context of “coherent lower previsions” [3] the problem goes under the name of “marginal extension” [19]. However, marginal extension produces results which are not closed in the BF framework, as the marginal extension of two belief functions is in general a coherent lower prevision rather than a belief function. In the course of this project we plan to develop a coherent theory of total probability for belief functions.

Workplan of the project








Approaching the restricted total belief theorem




If we force the a-priori function to have only disjoint focal elements (i.e., it is the vacuous extension of a Bayesian function on some coarsening of P), we obtain what we can call the restricted total belief theorem. This is the case of the data association problem, in which the a-priori belief function is usually a simple support function whose core contains only a few elements. In this case we can solve the simplified problem by considering each focal element of the a-priori (marginal) BF separately, and later combining the sub-solutions by simply normalizing the resulting basic probability assignments, as sketched below.
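
The final combination step might look as follows; this is only an illustration under assumptions of my own (each sub-solution is already available as a BPA over subsets of the union of the partition elements covered by the corresponding focal element of the prior), and the construction of the sub-solutions themselves, which is the heart of the restricted theorem, is not shown.

```python
def combine_subsolutions(prior_masses, subsolutions):
    """Hypothetical sketch: weigh each sub-solution BPA by the mass the
    a-priori BF assigns to its (disjoint) focal element, then normalize
    the merged basic probability assignment."""
    total = {}
    for w, m_i in zip(prior_masses, subsolutions):
        for B, v in m_i.items():
            total[B] = total.get(B, 0.0) + w * v
    norm = sum(total.values())          # renormalize to a proper BPA
    return {B: v / norm for B, v in total.items()}
```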

It can be proven that, in the restricted case, candidate solutions correspond to linear systems of equations, which in turn form the nodes of a solution graph. The figure below illustrates two examples of such graphs and their apparent symmetries.


More details can be found (for the time being) in my Ph.D. thesis, Chapter 7.



Bibliography




[1] L. de Campos, J. Huete, and S. Moral, Probability intervals: a tool for uncertain reasoning, Int. J. Uncertainty Fuzziness Knowledge-Based Syst., Vol. 2, pp. 167–196, 1994
[2] I. Levi, The enterprise of knowledge, MIT Press, 1980
[3] P. Walley, Statistical reasoning with imprecise probabilities, Chapman and Hall, 1991
[4] B. De Finetti, Theory of probability, Wiley, 1974
[5] G. Shafer, A mathematical theory of evidence, Princeton University Press, 1976
[6] A. P. Dempster, Upper and lower probabilities induced by a multivalued mapping, Annals of Mathematical Statistics, Vol. 38(2), pp. 325-339, 1967
[7] T. M. Strat, Decision analysis using belief functions, International Journal of Approximate Reasoning, Vol. 4, pp. 391-417, 1990
[8] Ph. Smets, Decision making in the TBM: the necessity of the pignistic transformation, International Journal of Approximate Reasoning, Vol. 38(2), pp. 133-147, 2005
[9] M. C. M. Troffaes, Decision making under uncertainty using imprecise probabilities, Int. J. Approx. Reasoning, Vol. 45(1), pp. 17-29, 2007
[10] P. Kohli and Ph. Torr, Efficiently solving dynamic Markov random fields using graph cuts, Proc. of ICCV'05, Vol. 2, pp. 922-929, 2005
[11] F. Cuzzolin, Visions of a generalized probability theory, Ph.D. Dissertation, University of Padua, 2001
[12] S. Moral and N. Wilson, Importance Sampling Monte-Carlo Algorithms for the Calculation of Dempster-Shafer Belief, Proc. of IPMU'96, 1996
[13] J. Y. Jaffray, Bayesian updating and belief functions, IEEE Transactions on Systems, Man and Cybernetics, Vol. 22, pp. 1144–1152, 1992
[14] D. Denneberg, Conditioning (updating) non-additive probabilities, Ann. Operations Res., Vol. 52, pp. 21–42, 1994
[15] Ph. Smets, Belief functions: the disjunctive rule of combination and the generalized Bayesian theorem, International Journal of Approximate Reasoning, Vol. 9, pp. 1–35, 1993
[16] H. E. Kyburg, Bayesian and non-Bayesian evidential updating, Artificial Intelligence, Vol. 31, pp. 271–293, 1987
[17] F. Cuzzolin, Geometric conditioning of belief functions, Belief'10, 2010
[18] J. Oxley, Matroid theory, Oxford University Press, 1992
[19] E. Miranda and G. de Cooman, Marginal extension in the theory of coherent lower previsions, International Journal of Approximate Reasoning, Vol. 46(1), pp. 188-225, September 2007
[20] P. Suppes and M. Zanotti, On using random relations to generate upper and lower probabilities, Synthese, Vol. 36, pp. 427–440, 1977
[21] Ph. Smets, Jeffrey's rule of conditioning generalized to belief functions, Proceedings of UAI'93, pp. 500–505, 1993
[22] Y. Bar-Shalom, Tracking and data association, Academic Press, 1987
[23] F. Cuzzolin and P. Snow (Eds.), Special Issue on “Competing fusion methods: real-world performances”, Information Fusion Journal, 2010