What is a belief function? In this vignette, I first give an example showing the difference between a belief function and a probability function. Then I use a small problem, the Monty Hall game, to show the use of some functions of the package: how to encode pieces of evidence and relations, combine them in a belief network, and compute measures of belief and plausibility.
To learn more about belief functions, see the book of Glenn Shafer [1].
Next August, I plan to spend a few days in Forillon National Park. I went there many years ago and had to set up my tent under heavy rain, after 10 hours of driving. Not good. This time, I decided to check the Canadian Weather website and look at past years' statistics for the month of August before booking.
Unfortunately, there are no historical data about the number of sunny days in a month; the website only gives the quantity of rain for each day. Looking at the past five years, I find a median of 14 days of rain in August, or about 45% of the days of the month. So I start my analysis with this information as a probability distribution: (rain: 45%, no rain: 55%).
What can I find out about the sun? I will use the statistics on rain to establish a compatibility relation between rain and sun. If there is rain, there is generally no sun, so I put a mass value of 0.45 on the event “no sun”. On the other hand, no rain does not necessarily mean sun; it can also be cloudy. Hence “no rain” is compatible with the event {“sun”, “no sun”}, which receives a mass value of 0.55.
I have just defined a belief function: I used a known probability distribution on one situation (rain) and extended it to another situation (sun) where probabilities are unknown. I can now use the function bca to encode the events of interest and their mass values. Next, I look at the measures of belief and plausibility with the function tabresul.
# Evidence for sun
Weather <- bca(f = matrix(c(1, 0, 0, 1, 1, 1), ncol = 2, byrow = TRUE),
               m = c(0, 0.45, 0.55), cnames = c("Sun", "NoSun"),
               infovarnames = "Weather", varnb = 1)
# The belief function of Weather
tabresul(Weather)
#> $mbp
#>       Sun NoSun mass Belief Plausibility Plty Ratio
#> Sun     1     0 0.00   0.00         0.55   0.550000
#> NoSun   0     1 0.45   0.45         1.00   1.818182
#> frame   1     1 0.55   1.00         1.00        Inf
#>
#> $Conflict
#> [1] 0
We don’t have a probability distribution here; only one of the two elementary events has received a mass value. The elementary event “no sun” has a degree of support (belief) of 0.45 and the elementary event “sun” has a degree of support of 0. The remaining mass of 0.55 has been allotted to the frame \(\{Sun, NoSun\}\). This expresses the part of ignorance that remains.
The odds of “sun” are not very good at 0.55. Maybe I should look at July for my vacation instead of August? But that is another story.
If we want to stay within probability theory, we can apply a transformation to the plausibility distribution of the elementary events to obtain a probability distribution. This is called the plausibility transformation.
Finally, starting from statistics on rain, we end up with a probability of sun of 0.35. The odds Sun/NoSun are 0.55. Note that this is the plausibility ratio given by the function tabresul.
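To illustrate, here is a minimal sketch of this computation, simply normalizing the plausibilities of the two elementary events taken from the tabresul output above:
# Plausibility transformation (sketch): normalize the plausibilities
# of the elementary events to obtain a probability distribution
pl <- c(Sun = 0.55, NoSun = 1.00)
round(pl / sum(pl), 2)
#>   Sun NoSun 
#>  0.35  0.65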
Let us recall the Monty Hall Game from its statement in the Wikipedia article on the subject (Monty Hall problem):
"Suppose you're on a game show. Three doors A, B and C are in front of you. Behind one door is a brand new car, and behind the two others, a goat. You are asked to pick one of the three doors. Then the host of the game, who knows what's behind the doors, opens one of the two remaining doors and shows a goat. He then asks you: "Do you want to switch doors or keep your initial choice?"
Say you have chosen the door A and the host has opened door B. The question now is: Is it to your advantage to switch your choice from door A to door C?"
Some notation to begin with
For each door A, B, C, consider the same frame F with three possible values: \[F = \{car, goat 1, goat 2\}.\] I use (0,1)-vectors to identify each element of the frame. Hence, the element “car” is identified by the vector \((1, 0, 0)\), goat 1 by the vector \((0, 1, 0)\) and goat 2 by the vector \((0, 0, 1)\).
With this notation, any subset of F has a unique (0,1) representation. For example, the subset \(\{goat 1, goat 2\}\) is represented by the vector \((0, 1, 1)\).
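In R, such subsets are simply rows of a (0,1) matrix; this is exactly how the tt matrices used below are built. A small illustration:
# Some subsets of the frame F = {car, goat 1, goat 2} as (0,1) vectors
subsets <- matrix(c(1, 0, 0,   # the singleton {car}
                    0, 1, 1,   # the subset {goat 1, goat 2}
                    1, 1, 1),  # the whole frame F
                  ncol = 3, byrow = TRUE,
                  dimnames = list(c("car", "g1 + g2", "frame"), c("car", "g1", "g2")))
subsets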
We have three pieces of information to consider: the possible arrangements of the car and the goats behind the doors, the door chosen by the contestant, and the door opened by the host.
There are six possible ways of arranging the car and the two goats behind the three doors A, B, C: \((car, goat 1, goat 2)\), \((car, goat 2, goat 1)\), \((goat 1, car, goat 2)\), \((goat 1, goat 2, car)\), \((goat 2, car, goat 1)\), \((goat 2, goat 1, car)\).
This information will be encoded in the product space AxBxC. The frame of this product space, noted \(F_{ABC}\), is the product of the frames of A, B and C; hence it has \(3^3 = 27\) elements. The six possible arrangements of the car and goats determine a subset S of the frame \(F_{ABC}\). A mass of 1 will be allotted to this subset S.
We use the function bcaRel to establish the desired relation between the doors. We prepare the necessary inputs and call the function.
# 1. define the tt matrix MHABC_tt, which encodes the subset S
#
MHABC_tt <- matrix(c(1,0,0,0,1,0,0,0,1,
1,0,0,0,0,1,0,1,0,
0,1,0,1,0,0,0,0,1,
0,1,0,0,0,1,1,0,0,
0,0,1,1,0,0,0,1,0,
0,0,1,0,1,0,1,0,0), ncol=9, byrow=TRUE)
colnames(MHABC_tt) <- rep(c("car", "g1", "g2"), 3)
#
# 2. define the spec matrix.
# Here we have one subset of six elements
#
MHABC_spec <- matrix(rep(1, 12), ncol = 2, dimnames = list(NULL, c("specnb", "mass")))
#
# 3. define the info matrix.
# for each variable, we attribute a number and give the size of the frame
#
MHABC_info <- matrix(c(1:3, rep(3, 3)), ncol = 2, dimnames = list(NULL, c("varnb", "size")))
#
# 4. Call the function with the names of the variables and the number of the relation
#
MHABC_rel <- bcaRel(tt = MHABC_tt, spec = MHABC_spec, infovar = MHABC_info, infovarnames = c("MHA", "MHB", "MHC"), relnb = 1)
#
# let's see some results:
# Note that row labels of the resulting tt matrix can become pretty long.
bcaPrint(MHABC_rel)
#> MHABC_rel specnb
#> 1 car g1 g2 + car g2 g1 + g1 car g2 + g1 g2 car + g2 car g1 + g2 g1 car 1
#> mass
#> 1 1
You have chosen door A. At this point, the problem is quite simple: your belief is equally divided between three possible outcomes, car, goat 1 or goat 2: \(m(\{car\}) = m(\{g1\}) = m(\{g2\}) = 1/3\). Let’s encode this evidence with the function bca.
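A sketch of this encoding, following the bca call pattern used for the Weather example (door A is variable number 1 in the relation defined above):
# Evidence for door A: an equal mass of 1/3 on each singleton of the frame
MHA_E <- bca(f = diag(3), m = rep(1/3, 3),
             cnames = c("car", "g1", "g2"),
             infovarnames = "MHA", varnb = 1)
bcaPrint(MHA_E)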
But the host wanted to add some thrill to the game. He opened door B and revealed a goat. The host has thus given us a small piece of evidence: goat 1 or goat 2 was behind door B. Since the host knows what is behind each door, the mass value of this piece of evidence is \(m(\{g1, g2\}) = 1\).
Let’s translate this into R with the function bca:
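A sketch of this encoding (door B is variable number 2 in the relation defined above):
# Evidence for door B: all the mass goes to the subset {g1, g2}
MHB_E <- bca(f = matrix(c(0, 1, 1), nrow = 1),
             m = 1, cnames = c("car", "g1", "g2"),
             infovarnames = "MHB", varnb = 2)
bcaPrint(MHB_E)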
We now look at the Monty Hall game as a belief network. Variables A, B and C are the nodes of the graph. The edges (hyperedges) are the pieces of evidence MHA_E and MHB_E and the relation MHABC_rel. We use the package igraph [2] to produce a bipartite graph corresponding to the desired hypergraph.
# The network
if (requireNamespace("igraph", quietly = TRUE) ) {
library(igraph)
# Encode pieces of evidence and relations with an incidence matrix
Monty_hgm <- matrix(c(1,1,1,1,0,0,0,1,0), ncol=3, dimnames = list(c("A", "B", "C"), c("r_ABC", "ev_A", "ev_B")))
# The graph structure
Monty_hg <- graph_from_incidence_matrix(incidence = Monty_hgm, directed = FALSE,
                                        multiple = FALSE, weighted = NULL, add.names = NULL)
V(Monty_hg)
# Show variables as circles, relations and evidence as rectangles
V(Monty_hg)$shape <- c("circle", "crectangle")[V(Monty_hg)$type+1]
V(Monty_hg)$label.cex <- 0.6
V(Monty_hg)$label.font <- 2
# render graph
plot(Monty_hg, vertex.label = V(Monty_hg)$name, vertex.size=(4+4*V(Monty_hg)$type)*8)
}
To obtain the Belief function for door C, we have to combine evidence for doors A and B with the relation on doors ABC. We will proceed by elimination of the doors (variables), eliminating A first, then B.
To do that, we extend the evidence on A to the product space AxBxC, combine the extended evidence with the relation on ABC, then marginalize to BxC. This gives a reduced network with variables B and C.
# Extend MHA to the product space AxBxC
MHA_ext <- extmin(MHA_E, MHABC_rel )
# Combine MHA_ext and MHABC_rel
MHA_ABC_comb <- dsrwon(MHA_ext, MHABC_rel)
MHA_ABC_comb$con
#> [1] 0
# Eliminate variable A ( since the measure of contradiction is 0, no need to normalize)
MHBC <- elim(MHA_ABC_comb, xnb = 1)
Monty2_hgm <- matrix(c(1,1,1,0), ncol=2, dimnames = list(c("B", "C"), c("r_BC", "ev_B")))
Monty2_hg <- graph_from_incidence_matrix(incidence = Monty2_hgm, directed = FALSE,
                                         multiple = FALSE, weighted = NULL, add.names = NULL)
V(Monty2_hg)
#> + 4/4 vertices, named, from 8aec36e:
#> [1] B C r_BC ev_B
# Variables as circles, relations and evidence as rectangles
V(Monty2_hg)$shape <- c("circle","crectangle")[V(Monty2_hg)$type+1]
V(Monty2_hg)$label.cex <- 0.6
V(Monty2_hg)$label.font <- 2
# render graph
plot(Monty2_hg, vertex.label = V(Monty2_hg)$name, vertex.size=(4+4*V(Monty2_hg)$type)*8)
Similarly, we extend evidence on B to BC, combine the extended evidence with the relation on BC, then marginalize to C. This will give the final result.
# Extend MHB_E to the space BxC
MHB_ext <- extmin(MHB_E, MHBC)
MHB_BC_comb <- dsrwon(MHB_ext, MHBC)
MHB_BC_comb$con
#> [1] 0
# Eliminate variable B (since MHB_BC_comb$con = 0, no need to normalize)
MHC <- elim(MHB_BC_comb, xnb = 2)
belplau(MHC)
#>            Belief Plausibility Plty Ratio
#> g1 + g2 0.3333333    0.3333333        0.5
#> car     0.6666667    0.6666667        2.0
As we can see, we double our chances of winning the car if we switch from door A to door C.
Note that there is no loss of generality by fixing the choices in the analysis (door A for the contestant, door B for the host).
To be more specific and to build a bridge with probability theory, we can add to our result the elementary events that have a mass of 0, so that we can see their measures of plausibility.
The function addTobca serves this purpose.
MHC_plus_singl <- addTobca(MHC, f=matrix(c(0,1,0,0,0,1), ncol = 3, byrow = TRUE))
tabresul(MHC_plus_singl)
#> $mbp
#>         car g1 g2      mass    Belief Plausibility Plty Ratio
#> g1        0  1  0 0.0000000 0.0000000    0.3333333  0.3333333
#> g2        0  0  1 0.0000000 0.0000000    0.3333333  0.3333333
#> g1 + g2   0  1  1 0.3333333 0.3333333    0.3333333  0.5000000
#> car       1  0  0 0.6666667 0.6666667    0.6666667  2.0000000
#>
#> $Conflict
#> [1] 0
[1] Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton University Press, Princeton, New Jersey. 297 pp.
[2] Csardi, G. and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems 1695. http://igraph.org