|IJCAI 2016 Tutorial: Belief functions for the working scientist|
Previous UAI 2015 tutorial on YouTube
IJCAI 2016 PDF SLIDES
All fields of Artificial Intelligence and applied science are subject to various degrees of uncertainty,
caused by missing or scarce data: random sets and their subjective incarnations, called belief functions,
naturally arise when one tries to formalise lack of data in a coherent way. This tutorial introduces the basic
principles of the theory of belief functions to the wider AI audience, describes practical tools based on them, and compares their
performance against more classical approaches to uncertainty on cutting-edge real-world problems.
The theory of belief functions, sometimes referred to as evidence theory or Dempster-Shafer theory, was first introduced by Arthur P. Dempster in the context of statistical inference, and was later developed by Glenn Shafer as a general framework for modelling epistemic uncertainty. Belief theory and the closely related random set theory form a natural framework for modelling situations in which data are missing or scarce: think of extremely rare events such as volcanic eruptions or power plant meltdowns, problems subject to huge uncertainties due to the number and complexity of the factors involved (e.g. climate change), but also the all-important issue of generalisation from small training sets in machine learning.
This tutorial is designed to introduce the principles and rationale of random sets and belief function theory to the wider AI audience, survey the key elements of the methodology and its most recent developments, and make AI practitioners aware of the set of tools that have been developed for reasoning in the belief function framework on real-world problems.
In recent years the number of papers published on the theory and application of belief functions has been booming (reaching over 1200 in 2014 alone, see the histogram below), displaying strong growth in particular in the East Asian community and among practitioners working on multi-criteria decision making, earth sciences (GIS), and sensor fusion.
Indeed, belief functions are a natural tool for coping with heavy uncertainty, lack of evidence and missing data, and extremely rare events, issues which are handled far less naturally in standard probability. An early debate on the rationale of belief functions contributed strongly to the growth and success of the AI community and its conference series in the Eighties and Nineties, thanks to scientists of the calibre of Glenn Shafer, Judea Pearl, Philippe Smets and Prakash Shenoy, among others. Ever since, however, the wider AI and uncertainty theory communities have somewhat diverged. The proposer's recent work has been directed at restoring a closer relationship and exchange of ideas between the two groups. This was a stated aim of the recent BELIEF 2014 International Conference, of which the proposer was General Chair:
The conference led to closer collaboration with the UAI and FUSION communities, as a promising initial result. A number of books on the subject are being published as we speak by several key researchers (including the presenter), and the impact of the belief function approach to uncertainty is growing, as attested by the increasing number of (especially application) papers on the subject.
This IJCAI 2016 tutorial aims to bridge the gap between core researchers in the field and the wider AI community, with the longer-term goal of a more fruitful cross-fertilisation between these related fields. It will give non-expert AI scientists the necessary references and instruments to investigate the approach in more detail, either by themselves or with the active support of the proposer.
We see this IJCAI proposal as part of a sustained effort by the proposer to make AI scholars aware of the array of useful tools based on belief calculus, and to bridge the gap between sister disciplines such as AI and uncertainty theory: an effort which will likely occupy us for years to come.
Uncertainty is of paramount importance in artificial intelligence, applied science, and many other areas of human endeavour. Whilst each and every one of us possesses some intuitive grasp of what uncertainty is, providing a formal definition can prove elusive. Uncertainty can be understood as a lack of information about an issue of interest for a certain agent (e.g., a human decision maker or a machine), a condition of limited knowledge in which it is impossible to exactly describe the state of the world or its future evolution.
According to Dennis Lindley:
“There are some things that you know to be true, and others that you know to be false; yet, despite this extensive knowledge that you have, there remain many things whose truth or falsity is not known to you. We say that you are uncertain about them. You are uncertain, to varying degrees, about everything in the future; much of the past is hidden from you; and there is a lot of the present about which you do not have full information. Uncertainty is everywhere and you cannot escape from it”.
What is sometimes less clear to scientists themselves is the existence of a hiatus between two fundamentally distinct forms of uncertainty. The first level consists of somewhat ‘predictable’ variations, which are typically encoded as probability distributions. For instance, a person playing a fair roulette wheel will by no means know the outcome in advance, but will nevertheless be able to predict the frequency with which each outcome manifests itself (1/36), at least in the long run.
The second level is about ‘unpredictable’ variations, which reflect a more fundamental uncertainty about the laws which themselves govern the outcome. Following on with our example, suppose the player is presented with ten different doors, each leading to a room containing a roulette wheel modelled by a different probability distribution. They will then be uncertain about the very game they are supposed to play. How will this affect their betting behaviour, for instance?
Lack of knowledge of the second kind is often called Knightian uncertainty, after Chicago economist Frank Knight. He famously distinguished ‘risk’ from ‘uncertainty’:
“Uncertainty must be taken in a sense radically distinct from the familiar notion of risk, from which it has never been properly separated.... The essential fact is that ‘risk’ means in some cases a quantity susceptible of measurement, while at other times it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating.... It will appear that a measurable uncertainty, or ‘risk’ proper, as we shall use the term, is so far different from an unmeasurable one that it is not in effect an uncertainty at all.”
In Knight's terms, ‘risk’ is what people normally call probability or chance, while the term ‘uncertainty’ is reserved for second-order uncertainty. The latter has a measurable effect on human behaviour: people are demonstrably averse to unpredictable variations (as highlighted by Ellsberg’s paradox).
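The second-order ignorance in the ten-doors scenario is exactly what belief functions are designed to express. The sketch below is a toy, hypothetical Dempster-Shafer calculation (the frame, masses and names are made up for illustration, not taken from the tutorial material): a mass function assigns probability mass to sets of outcomes, and each event then gets a belief/plausibility interval rather than a single number.

```python
# A minimal, hypothetical Dempster-Shafer sketch (illustrative numbers only).
# On a frame of discernment Omega = {red, black, green}, a mass function
# assigns probability mass to SETS of outcomes, so total ignorance is
# m(Omega) = 1 rather than a uniform distribution over the three outcomes.

omega = frozenset({"red", "black", "green"})

# Example: some evidence supports 'red'; the rest stays uncommitted.
m = {
    frozenset({"red"}): 0.4,
    omega: 0.6,  # mass left on the whole frame encodes ignorance
}

def belief(A, m):
    """Bel(A): total mass of focal sets contained in A (lower bound)."""
    return sum(v for B, v in m.items() if B <= A)

def plausibility(A, m):
    """Pl(A): total mass of focal sets intersecting A (upper bound)."""
    return sum(v for B, v in m.items() if B & A)

A = frozenset({"red"})
print(belief(A, m), plausibility(A, m))  # 0.4 1.0 - an interval, not a point
```

When every focal set is a singleton, the interval [Bel, Pl] collapses and an ordinary probability distribution is recovered: standard probability is the special case in which there is no second-order uncertainty.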
This difference between predictable and unpredictable variation is one of the fundamental issues in the philosophy of probability, and is sometimes referred to as the distinction between common-cause and special-cause variation. Different interpretations of probability treat these two aspects of uncertainty in different ways. Economists John Maynard Keynes and G. L. S. Shackle have also contributed to this debate.
A long line of scholars has argued that a number of serious issues arise whenever uncertainty is handled via Andrey Kolmogorov’s measure-theoretic probability theory. On top of that, one can argue that something is wrong with both mainstream approaches to probability interpretation. Before we move on to introduce the mathematics of belief functions and other alternative theories of uncertainty, we think it best to briefly summarise our own take on these issues here.
Flaws of the frequentist setting. The setting of frequentist hypothesis testing is rather questionable. First of all, its scope is quite narrow: rejecting or not rejecting a hypothesis (although confidence intervals can also be provided). The criterion according to which this decision is made is arbitrary: who decides what an ‘extreme’ realisation is? In other words, who decides what the right choice for the significance level α is? What is the deal with ‘magical’ numbers such as 0.05 and 0.01? In fact, the whole ‘tail event’ idea derives from the fact that, under measure theory, the conditional probability (p-value) of a point outcome is zero – the framework seems to be trying to patch up what is instead a fundamental problem with the way probability is mathematically defined. Last but not least, hypothesis testing cannot cope with pure data without making additional assumptions about the process (experiment) which generates them.
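The arbitrariness of the threshold is easy to demonstrate. In the sketch below (the test statistic is made up for illustration), a two-sided p-value for a standard normal statistic sits between the two ‘magical’ numbers, so the reject/not-reject decision flips depending on which threshold one happens to prefer:

```python
# Hypothetical frequentist test: the decision hinges on the choice of alpha.
from math import erf, sqrt

def p_value_two_sided(z):
    # Two-sided p-value for a standard normal test statistic,
    # using the error function to evaluate the normal CDF.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

z = 2.2  # observed test statistic (illustrative value)
p = p_value_two_sided(z)
print(round(p, 4))         # ~0.0278
print(p < 0.05, p < 0.01)  # rejected at alpha=0.05, not at alpha=0.01
```

The data are identical in both cases; only the conventional threshold differs, which is precisely the arbitrariness the paragraph above objects to.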
The issues with Bayesian reasoning. Bayesian reasoning is also flawed in a number of ways. It is extremely bad at representing ignorance: Jeffreys' uninformative priors (e.g., in finite settings, uniform probability distributions over the set of outcomes), the common way of handling ignorance in a Bayesian setting, lead to different results under different reparameterisations of the universe of discourse. Bayes’ rule assumes the new evidence comes in the form of certainty (‘A is true’): in the real world, this is often not the case. As pointed out by Klir, a precise probabilistic model defined on only some class of events determines only interval probabilities for events outside that class. Finally, model selection is troublesome in Bayesian statistics: whilst one is forced by the mathematical formalism to pick a prior distribution, there is no clear-cut criterion on how to actually do so. In the author’s view, this is the result of a fundamental confusion between the original Bayesian description of a person’s subjective system of beliefs and the way it is updated, and the ‘objectivist’ view of Bayesian reasoning as a rigorous procedure for updating probabilities when presented with new information.
We expect participants to come from all areas of Artificial Intelligence: first and foremost those directly involved with uncertainty theory, but also experts in Bayesian reasoning keen on reaching out towards alternative approaches, and practitioners from all fields of applied science, especially from application sectors in which heavy uncertainty and missing data are a real issue (climate change, policy-making, and many others). The video of the preliminary version which appeared at UAI has so far hit 606 views.
Attendees will learn about the principles of belief function theory, its rationale (especially in comparison with Bayesian reasoning and other approaches to the representation of uncertainty), and a complete set of tools for reasoning in the belief function framework (conditioning, inference, graphical models, efficient computation, decision making, regression and classification, and so on). They will be made aware of the most recent methodological developments in the field, and will acquire first-hand knowledge of how to apply these tools to significant problems in the fields of computer vision, climate change, and others. The performance of such approaches will be critically compared with that of more classical regression, classification or estimation methods to highlight the advantage of modelling lack of data explicitly.
At the end of the tutorial, the audience should walk away with a much better awareness of the widespread presence of uncertainty in all fields of science, of the possible approaches to its mathematical representation, and of where belief functions sit in this context. They will be aware of the fact that a complete battery of tools for dealing with inference, decision, regression and classification based on belief functions is available. They will be provided with online resources helping them to investigate the topic further or apply existing algorithms to their problems of choice. As a result, they might want to reconsider their approaches to their applications of interest in the light of what they have learned.
We will not assume any prior knowledge of the subject, although of course some background in basic statistical methods (as normally expected from an IJCAI audience) will be desirable and will possibly help the audience gain the most from this tutorial. In any case, we will start from scratch when introducing the subject, and make ample use of examples to illustrate the most critical points.
Professor Fabio Cuzzolin is the Head of the Visual Artificial Intelligence Laboratory at Oxford Brookes University, Oxford, UK. The group currently includes five members of staff, a KTP associate, three PhD students and four MSc students. Dr Cuzzolin is a recognised world expert in uncertainty theory and belief function theory. He has worked extensively for 15 years on the mathematical foundations of belief calculus. His main contribution there is a geometric approach to uncertainty measures, in which uncertainty measures are represented as points of a Cartesian space and analysed there.
His work in this field has appeared or is in the process of appearing in two separate monographs, published by Springer-Verlag (The geometry of uncertainty, to appear in June 2016 under the Artificial Intelligence: Foundations, Theory, and Algorithms series):
and Lambert Academic Publishing (Visions of a generalized probability theory, September 2014):
Dr Cuzzolin was the General Chair of the 3rd International Conference on Belief Functions (BELIEF 2014), held at St Hugh's College, Oxford, in September 2014, and is Guest Editor of the IJAR Special Issue dedicated to the conference, to be released in 2016. He has served three terms as a member of the Board of Directors of the Belief Functions and Applications Society (BFAS).
Dr Cuzzolin is also a senior associate member of the world-leading Torr Vision Group at Oxford University, as an expert in machine learning applications to computer vision, in particular: action recognition via discriminative deformable models, online activity recognition by deep learning, multilinear classifiers applied to gait identification and EEG classification, dimensionality reduction and metric learning for dynamical models.
He is a member of the Senior Program Committee of UAI and of IJCAI's Program Committee. He is Associate Editor of IEEE Transactions on Fuzzy Systems, the top journal in computer science by 2015 impact factor, and Guest Editor of the International Journal of Approximate Reasoning. He was previously on the editorial boards of IEEE Systems, Man and Cybernetics C and Elsevier’s Information Fusion journal. Cuzzolin has served on the programme committees of more than 60 international conferences, including UAI, IJCAI, ISIPTA, ECSQARU, BMVC, ACCV, IEEE SMC, IPMU, FLAIRS, and others.
He is currently the author of some 90 publications, including 2 monographs, an edited volume, 3 book chapters, and some 20 journal papers (most of them in top CS journals by impact factor, such as IEEE Transactions on Fuzzy Systems, IEEE Transactions on Cybernetics, IEEE Transactions on Pattern Analysis and Machine Intelligence, and the International Journal of Computer Vision). His work has won several awards in recent years, including best paper at PRICAI’08 (the Pacific Rim Conference on AI), best poster at INRIA’s 2012 Machine Learning Summer School, the Poster Prize at ISIPTA’11 (the International Symposium on Imprecise Probabilities and Their Applications), and outstanding reviewer at BMVC’12 (the British Machine Vision Conference). In October 2012 he received the Next 10 Award, given to the Faculty of Technology’s top emerging researchers.
Click here for a detailed list of publications.
The geometry of uncertainty - The geometry of imprecise probabilities
Artificial Intelligence: Foundations, Theory, and Algorithms (http://www.springer.com/series/13900)
Springer-Verlag, June 2016 (working draft)
Visions of a generalized probability theory
Lambert Academic Publishing, September 2014
A mathematical theory of evidence
Princeton University Press, April 1976
Fabio Cuzzolin (Editor)
Proceedings of the 3rd International Conference on Belief Functions (BELIEF 2014)
Springer-Verlag, Lecture Notes in Artificial Intelligence (LNAI/LNCS)
Volume 8764, September 2014