This short document analyses the climate change dataset originally presented in the hyperdirichlet R package [1], but using the hyper2 package instead. Lay perception of climate change is a complex and interesting process [2]; here we assess the engagement of non-experts by the use of “icons” (the word is standard in this context: an icon is a “representative symbol”) that illustrate different impacts of climate change.
Here we analyse the results of one experiment [3], in which subjects were presented with a set of icons of climate change and asked to identify the one they found most concerning. Six icons were used: PB [polar bears, which face extinction through loss of ice floe hunting grounds], NB [the Norfolk Broads, which flood owing to intense rainfall events], L [London flooding, as a result of sea level rise], THC [the thermohaline circulation, which may slow or stop as a result of anthropogenic modification of the water cycle], OA [oceanic acidification as a result of anthropogenic emissions of \(\mathrm{CO}_2\)], and WAIS [the West Antarctic Ice Sheet, which is rapidly calving as a result of climate change].
Methodological constraints dictated that each respondent could be presented with a maximum of four icons. The R idiom below (dataset icons in the package) shows the experimental results.
M <- matrix(c(
     5,  3, NA,  4, NA,  3,
     3, NA,  5,  8, NA,  2,
    NA,  4,  9,  2, NA,  1,
    10,  3, NA,  3,  4, NA,
     4, NA,  5,  6,  3, NA,
    NA,  4,  3,  1,  3, NA,
     5,  1, NA, NA,  1,  2,
     5, NA,  1, NA,  1,  1,
    NA,  9,  7, NA,  2,  0
), byrow=TRUE, ncol=6)
colnames(M) <- c("NB","L","PB","THC","OA","WAIS")
M
## NB L PB THC OA WAIS
## [1,] 5 3 NA 4 NA 3
## [2,] 3 NA 5 8 NA 2
## [3,] NA 4 9 2 NA 1
## [4,] 10 3 NA 3 4 NA
## [5,] 4 NA 5 6 3 NA
## [6,] NA 4 3 1 3 NA
## [7,] 5 1 NA NA 1 2
## [8,] 5 NA 1 NA 1 1
## [9,] NA 9 7 NA 2 0
(M is called icons_matrix in the package.) Each row of M corresponds to a particular cue given to respondents. The first row, for example, means that a total of \(5+3+4+3=15\) people were shown icons NB, L, THC, WAIS [the column names of the non-NA entries]; 5 people chose NB as most concerning, 3 chose L, and so on. The dataset is described more fully in the hyperdirichlet package.
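As a quick check (a sketch, not part of the original analysis), the row sums give the number of respondents shown each cue; rowSums() with na.rm=TRUE simply ignores the NA entries:
rowSums(M, na.rm=TRUE)   # first row: 5+3+4+3 = 15 respondents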
To recreate the icons likelihood function in the hyper2 package, we use the saffy() function:
library("hyper2",quietly=TRUE)
icons <- saffy(M)
icons
## NB^32 * (NB + L + THC + OA)^-20 * (NB + L + THC + WAIS)^-15 * (NB
## + L + OA + WAIS)^-9 * (NB + PB + THC + OA)^-18 * (NB + PB + THC +
## WAIS)^-18 * (NB + PB + OA + WAIS)^-8 * L^24 * (L + PB + THC +
## OA)^-11 * (L + PB + THC + WAIS)^-16 * (L + PB + OA + WAIS)^-18 *
## PB^30 * THC^24 * OA^14 * WAIS^9
At this point, the icons object as created above is mathematically identical to the icons object supplied in the hyperdirichlet package (and indeed in the hyper2 package itself), although the terms may appear in a different order.
The first step is to find the maximum likelihood estimate for the icons likelihood:
mic <- maxp(icons)
mic
## NB L PB THC OA WAIS
## 0.25230411 0.17364433 0.22458188 0.17011281 0.11068604 0.06867083
dotchart(mic,pch=16)
We also need the log-likelihood at an unconstrained maximum:
L1 <- loglik(icons,indep(mic))
L1
## [1] -174.9974
This agrees to four decimal places with the value given in the hyperdirichlet package.
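As an aside (a sketch, not part of the original analysis), the same idiom gives the support for the maximum likelihood point over the neutral point of equal strengths:
L1 - loglik(icons, indep(equalp(icons)))   # support for mic over p_i = 1/6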
The next step is to assess a number of hypotheses concerning the relative sizes of \(p_1\) through \(p_6\).
Following the analysis in Hankin 2010, we first observe that NB [the Norfolk Broads] is the largest-probability icon (as expected on theoretical grounds by O’Neill). This would suggest that \(p_1\) is large, in some sense. We consider the hypothesis that \(p_1\leqslant 1/6\), and to that end perform a constrained optimization with the (active) constraint \(p_1\leqslant 1/6\). This is most easily done by imposing additional constraints on the maxp() function via the fcm and fcv arguments, which specify linear constraints \(\mathrm{fcm}\cdot p\geqslant\mathrm{fcv}\) on the independent components of \(p\): here \(p_1\leqslant 1/6\) becomes \(-p_1\geqslant -1/6\), so fcm=c(-1,0,0,0,0) and fcv=-1/6:
small <- 1e-6 # ensure start at an interior point
maxlike_constrained <-
maxp(icons, startp=indep(equalp(icons))-c(small,0,0,0,0),
give=TRUE, fcm=c(-1,0,0,0,0),fcv= -1/6)
maxlike_constrained
## $par
## [1] 0.1666667 0.1916197 0.2681047 0.1835885 0.1190362
##
## $value
## [1] -177.6602
##
## $counts
## function gradient
## 763 109
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $outer.iterations
## [1] 3
##
## $barrier.value
## [1] 0.00015538
Note that the value of \(p_1\) is equal to \(1/6\): the maximum is on the boundary of the admissible region. The extra support is given by
ES <- L1-maxlike_constrained$value
ES
## [1] 2.662764
(compare 2.608181 from the hyperdirichlet package). This exceeds Edwards’s two-units-of-support criterion; a \(p\)-value would be
pchisq(2*ES,df=1,lower.tail=FALSE)
## [1] 0.02101524
Both the support and the \(p\)-value indicate that we may reject the hypothesis that \(p_1\leqslant 1/6\) and thereby infer that \(p_1>\frac{1}{6}\).
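The same steps recur below, so it may help to wrap them in a small helper function (a sketch only: test_constraint() is not part of hyper2, and the chi-square p-value is asymptotic):
test_constraint <- function(fcm, fcv, startp){
  fit <- maxp(icons, startp=startp, give=TRUE, fcm=fcm, fcv=fcv)
  ES  <- L1 - fit$value                      # extra support
  c(extra_support = ES,
    p_value = pchisq(2*ES, df=1, lower.tail=FALSE))
}
test_constraint(c(-1,0,0,0,0), -1/6, indep(equalp(icons)) - c(small,0,0,0,0))
# should reproduce the values above: ES = 2.662764, p = 0.02101524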
We now perform another constrained likelihood maximization, although this one is not expressible with convex constraints: we have to compare L1 against the maximum likelihood over the region in which \(p_1\) is less than or equal to at least one of \(p_2,\ldots,p_6\). This region is a union of convex sets, and such a union is not necessarily convex (think two-way Venn diagrams). As far as I can see, the only way to proceed is to perform a sequence of five constrained optimizations: \(p_1\leqslant p_2\), \(p_1\leqslant p_3\), \(p_1\leqslant p_4\), \(p_1\leqslant p_5\), and \(p_1\leqslant p_6\). The last of these involves the fillup value: \(p_1\leqslant p_6\) translates to \(2p_1+p_2+\cdots+p_5\leqslant 1\) (because \(p_6=1-p_1-\cdots-p_5\)). We then choose the largest likelihood from the five.
o <- function(Ul, Cl, startp, give=FALSE){
  small <- 1e-6  # ensure start at an interior point
  if(missing(startp)){ startp <- small*(1:5) + rep(0.1, 5) }
  out <- maxp(icons, startp=startp, give=TRUE, fcm=Ul, fcv=Cl)  # use the supplied start
  if(give){
    return(out)
  } else {
    return(out$value)
  }
}
p2max <- o(c(-1, 1, 0, 0, 0), 0)
p3max <- o(c(-1, 0, 1, 0, 0), 0)
p4max <- o(c(-1, 0, 0, 1, 0), 0)
p5max <- o(c(-1, 0, 0, 0, 1), 0)
p6max <- o(c(-2,-1,-1,-1,-1),-1)
(the final line is different because \(p_6\) is the fillup value).
likes <- c(p2max,p3max,p4max,p5max,p6max)
likes
## [1] -175.8237 -175.0836 -175.9986 -178.2043 -181.4944
ml <- max(likes)
ml
## [1] -175.0836
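To identify programmatically which constraint attains this maximum (a quick sketch):
which.max(likes)   # 2, i.e. the constraint p1 <= p3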
Thus the first element of likes corresponds to the maximum likelihood constrained so that \(p_1\leqslant p_2\); the second element corresponds to the constraint \(p_1\leqslant p_3\), and so on. The largest likelihood corresponds to the constraint that is easiest to break, in this case \(p_1\leqslant p_3\): this makes sense because \(p_3\) has the second highest MLE after \(p_1\). The extra support is given by
L1-ml
## [1] 0.08613913
(the hyperdirichlet package gives 0.0853 here, a surprisingly small discrepancy given the difficulties of optimizing over a nonconvex region). We conclude that there is no evidence for \(p_1\geqslant\max\left(p_2,\ldots,p_6\right)\).
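For comparison with the earlier test, the corresponding asymptotic \(p\)-value would be (a sketch; the chi-square approximation is of doubtful validity for a composite hypothesis over a nonconvex region):
pchisq(2*(L1-ml), df=1, lower.tail=FALSE)   # well above any conventional threshold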
It’s worth looking at the evaluates (the points at which the constrained maxima are attained) too:
o(c(-1, 1, 0, 0, 0), 0,give=TRUE)$par
## [1] 0.2120674 0.2120674 0.2302719 0.1649972 0.1121906
o(c(-1, 0, 1, 0, 0), 0,give=TRUE)$par
## [1] 0.2377191 0.1736393 0.2377192 0.1718361 0.1100634
o(c(-1, 0, 0, 1, 0), 0,give=TRUE)$par
## [1] 0.2109081 0.1787235 0.2230625 0.2109081 0.1071554
o(c(-1, 0, 0, 0, 1), 0,give=TRUE)$par
## [1] 0.1809502 0.1764378 0.2285099 0.1651894 0.1809502
o(c(-2,-1,-1,-1,-1),-1,give=TRUE)$par
## [1] 0.1585836 0.1776468 0.2326462 0.1646608 0.1078789
The next hypothesis concerned the smallness of \(p_5+p_6\): we suspected that \(p_5+p_6<\frac{1}{3}\). This translates into optimizing subject to \(p_5+p_6\geqslant\frac{1}{3}\), and the required constraint is \(-p_1-p_2-p_3-p_4\geqslant-\frac{2}{3}\) (because \(p_5+p_6=1-p_1-p_2-p_3-p_4\)). Dead easy:
jj <- o(c(-1,-1,-1,-1,0), -2/3, give=TRUE, startp=indep((1:6)/21))$value
jj
## [1] -182.2108
then the extra support is
L1-jj
## [1] 7.213359
(compare 7.711396 from the hyperdirichlet package; I am not sure why the discrepancy is so large).
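One obvious thing to check is whether the optimizer has found the global constrained maximum. Below is a minimal multistart sketch (assuming the o() helper above; feasible_start() is an ad hoc generator, not part of hyper2, of random starts satisfying \(p_5+p_6>1/3\)); if the restarts agree with jj, a local-optimum artefact seems unlikely:
feasible_start <- function(){
  w <- runif(1, 0.35, 0.60)             # p5+p6, safely above 1/3
  a <- runif(4); a <- (1-w)*a/sum(a)    # p1..p4
  b <- runif(2); b <- w*b/sum(b)        # p5, p6
  indep(c(a, b))                        # drop the fillup value p6
}
set.seed(1)
max(replicate(10, o(c(-1,-1,-1,-1,0), -2/3, startp=feasible_start())))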
The final example was \(\max\left\{p_5,p_6\right\}\geqslant\min\left\{p_1,p_2,p_3,p_4\right\}\). This means the optimization is constrained so that at least one of \(\left\{p_5,p_6\right\}\) exceeds at least one of \(\left\{p_1,p_2,p_3,p_4\right\}\). So we have the union of the various possibilities:
\[ \bigcup_{j\in\left\{5,6\right\}\atop k\in\left\{1,2,3,4\right\}} \left\{\left(p_1,p_2,p_3,p_4,p_5,p_6\right)\left|\sum p_i=1, p_j\geqslant p_k\right.\right\} \]
and of course \(p_6\geqslant p_2\), say, translates to \(-p_1-2p_2-p_3-p_4-p_5\geqslant -1\).
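These constraint rows can be built mechanically: for \(p_j\geqslant p_k\), write the coefficient vector of \(p_j-p_k\geqslant 0\) and eliminate the fillup \(p_6=1-p_1-\cdots-p_5\). A sketch (constraint_row() is hypothetical, not part of hyper2):
constraint_row <- function(j, k){
  v <- rep(0, 6); v[j] <- 1; v[k] <- -1     # coefficients of p_j - p_k >= 0
  list(fcm = v[1:5] - v[6], fcv = -v[6])    # eliminate the fillup p6
}
constraint_row(6, 2)   # fcm = c(-1,-2,-1,-1,-1), fcv = -1, as used below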
start <- indep(c(small,small,small,small,0.5-2*small,0.5-2*small))
jj <- c(
  o(c(-1, 0, 0, 0, 1), 0, startp=start),
  o(c( 0,-1, 0, 0, 1), 0, startp=start),
  o(c( 0, 0,-1, 0, 1), 0, startp=start),
  o(c( 0, 0, 0,-1, 1), 0, startp=start),
  o(c(-2,-1,-1,-1,-1),-1, startp=start),
  o(c(-1,-2,-1,-1,-1),-1, startp=start),
  o(c(-1,-1,-2,-1,-1),-1, startp=start),
  o(c(-1,-1,-1,-2,-1),-1, startp=start)
)
jj
## [1] -178.2043 -175.9429 -177.3242 -175.7738 -181.4944 -177.9784 -180.4066
## [8] -177.7537
max(jj)
## [1] -175.7738
So the extra support is
L1-max(jj)
## [1] 0.7763474
(compare 3.16 from the hyperdirichlet package; I am not sure why the difference arises). We should look at the maximum in more detail:
o(c( 0, 0, 0,-1, 1), 0, give=TRUE, startp=start)
## $par
## [1] 0.2531001 0.1702956 0.2177958 0.1448582 0.1448582
##
## $value
## [1] -175.7738
##
## $counts
## function gradient
## 464 73
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $outer.iterations
## [1] 2
##
## $barrier.value
## [1] 0.0001725539
So the evaluate is on the boundary, with \(p_4=p_5\). I have no explanation for the discrepancy between this result and the one in the hyperdirichlet package.
[1] R. K. S. Hankin 2010. “A generalization of the Dirichlet distribution”, Journal of Statistical Software, 33:11.
[2] S. C. Moser and L. Dilling 2007. “Creating a climate for change: communicating climate change and facilitating social change”. Cambridge University Press.
[3] S. O’Neill 2007. “An Iconic Approach to Representing Climate Change”. School of Environmental Science, University of East Anglia.