Abstract

The mountainplot package provides an extension to the lattice package for constructing mountain plots, which are also known as folded empirical cumulative distribution plots.

Setup

Load the package and use the singer data from the lattice package. Combine the first and second subdivisions of each voice part (for example, "Bass 1" and "Bass 2") into a new variable called section.

library("mountainplot")
data(singer, package = "lattice")
parts <- within(singer, {
section <- voice.part
section <- gsub(" 1", "", section)
section <- gsub(" 2", "", section)
section <- factor(section)
})
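# Reorder levels into a logical ordering.
# factor(..., levels=) reorders; assigning to levels() would relabel the data instead.
parts$section <- factor(parts$section, levels = c("Bass","Tenor","Alto","Soprano"))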

Mountain plot

A mountain plot, or folded empirical cumulative distribution function, is similar to an ordinary empirical CDF, but once the cumulative probability reaches 0.50 the curve is folded over, so that it decreases back toward zero instead of continuing upward to 1.
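
To make the folding concrete, here is a minimal base R sketch. The helper name fold_ecdf and the sample vector are hypothetical, and this is not necessarily how the package computes the curve internally.

```r
# Hypothetical helper (not part of the mountainplot package):
# evaluate the ECDF at each distinct data value and fold it at 0.5.
fold_ecdf <- function(x) {
  xs <- sort(unique(x))
  p <- ecdf(x)(xs)                     # ordinary empirical CDF
  folded <- ifelse(p > 0.5, 1 - p, p)  # reflect the upper half downward
  data.frame(x = xs, ecdf = p, folded = folded)
}

# Made-up sample, for illustration only
fold_ecdf(c(1, 2, 3, 4, 5))
```

The folded column never exceeds 0.5; it rises up to the median and falls back to 0 at the sample maximum, which gives the plot its mountain shape.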

Here is an example of traditional empirical CDFs.

require(latticeExtra) # for ecdfplot
## Loading required package: latticeExtra
## Loading required package: lattice
## Loading required package: RColorBrewer
ecdfplot(~height|section, data = parts, groups=voice.part, type='l',
         layout=c(1,4),
         main="Empirical CDF",
         auto.key=list(columns=4), as.table=TRUE)

Here is a view of the same data shown with a mountain plot.

mountainplot(~height|section, data = parts,
             groups=voice.part, type='l',
             layout=c(1,4),
             main="Folded Empirical CDF",
             auto.key=list(columns=4), as.table=TRUE)

Monti (1995) suggests that a mountain plot is helpful for exploring data and makes it easier to:

  1. Determine the median.
  2. Determine the range.
  3. Determine central or tail percentiles of any specified value.
  4. Observe outliers.
  5. Observe unusual gaps in the data.
  6. Examine the data for symmetry.
  7. Compare multiple distributions.
  8. Visually examine the sample size.

Additionally, the area under the curve is equal to the mean absolute deviation (MAD) from the median (Xue and Titterington 2011).
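
This identity can be checked numerically for a sample. The sketch below treats the folded ECDF as a step function and sums the areas of its rectangles. The function name folded_area and the data are hypothetical. The deviation is measured from the median, and base R's mad() computes a different quantity (a scaled median absolute deviation), so it is not used here.

```r
# Hypothetical helper: area under the folded empirical CDF of a sample.
folded_area <- function(x) {
  xs <- sort(x)
  n <- length(xs)
  p <- seq_len(n) / n          # ECDF height on the interval [xs[i], xs[i+1])
  folded <- pmin(p, 1 - p)     # folded ECDF height on that interval
  sum(folded[-n] * diff(xs))   # height times width, summed over intervals
}

# Made-up sample, for illustration only
x <- c(1, 2, 3, 4, 10)
folded_area(x)
mean(abs(x - median(x)))       # mean absolute deviation from the median
```

The two values agree up to floating-point error; tied observations contribute intervals of zero width and so do not affect the sum.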

Diabetic mice example

At about the same time, Huh (1995) developed the concept of the flipped empirical distribution function. The following code creates a mountain plot of Hand’s diabetic mice data, which can be compared to Huh’s version.

dmice <- data.frame(
  albumen=c(156,282,197,297,116,127,119,29,253,122,349,110,143,64,26,86,122,455,655,14,
          391,46,469,86,174,133,13,499,168,62,127,276,176,146,108,276,50,73,
          82,100,98,150,243,68,228,131,73,18,20,100,72,133,465,40,46,34, 44),
  group=c(rep('normal',20), rep('alloxan', 18), rep('insulin', 19))
)
mountainplot(~albumen, data=dmice, groups=group, auto.key=list(columns=3),
             main="Diabetic mice", xlab="Nitrogen-bound bovine serum albumen")

Session information

sessionInfo()
## R version 3.4.0 (2017-04-21)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=C                           LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] latticeExtra_0.6-28 RColorBrewer_1.1-2  lattice_0.20-35     mountainplot_1.2   
## [5] knitr_1.16         
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.11    digest_0.6.12   rprojroot_1.2   grid_3.4.0      backports_1.1.0
##  [6] magrittr_1.5    evaluate_0.10   stringi_1.1.5   rmarkdown_1.5   tools_3.4.0    
## [11] stringr_1.2.0   yaml_2.1.14     compiler_3.4.0  htmltools_0.3.6

Bibliography

Huh, Moon Yul. 1995. “Exploring Multidimensional Data with the Flipped Empirical Distribution Function.” Journal of Computational and Graphical Statistics 4 (4): 335–43. doi:10.2307/1390860.

Monti, K.L. 1995. “Folded Empirical Distribution Function Curves-Mountain Plots.” American Statistician 49: 342–45. doi:10.1080/00031305.1995.10476179.

Xue, Jing-Hao, and D Michael Titterington. 2011. “The P-Folded Cumulative Distribution Function and the Mean Absolute Deviation from the P-Quantile.” Statistics & Probability Letters 81 (8): 1179–82. doi:10.1016/j.spl.2011.03.014.