This vignette introduces the basic functionality of the assignR package using data bundled with the package. We’ll review how to access compiled data for known-origin biological samples and environmental models, use these to fit and apply functions estimating the probability of sample origin across a study region, and summarize these results to answer research and conservation questions. We’ll also demonstrate an assignment quality analysis tool useful in study design, method comparison, and uncertainty analysis.
Let’s load the package
Now add some data from the package to your local environment. Load and plot the North America boundary mask.
Load and plot a growing season precipitation H isoscape for North America. Notice this is a RasterBrick with two layers, the mean prediction and a standard deviation of the prediction. The layers are from waterisotopes.org, and resolution has been reduced to speed up processing in these examples. Full-resolution global growing season isoscapes are included in the package as d2h_world.rda and d18o_world.rda.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
The package includes a database of H and O isotope data for known origin samples, knownOrig.rda. Let’s load it and have a look. First we’ll get the names of the data fields available in the database.
## [1] "d2H" "d18O" "Taxon" "Group"
## [5] "Source_quality" "Age_code" "Reference" "ID"
Now lets look at a list of species names available.
## [1] Danaus plexippus Setophaga ruticilla
## [3] Turdus migratorius Setophaga coronata auduboni
## [5] Poecile atricapillus Thryomanes bewickii
## [7] Thryothorus ludovicianus Spizella passerina
## [9] Geothlypis trichas Setophaga pensylvanica
## [11] Baeolophus bicolor Vermivora chrysoptera
## [13] Catharus guttatus Setophaga citrina
## [15] Geothlypis formosa Geothlypis tolmiei
## [17] Oreothlypis ruficapilla Cardinalis cardinalis
## [19] Oreothlypis celata Junco hyemalis oregonus
## [21] Seiurus aurocapilla Vireo olivaceus
## [23] Melospiza melodia Catharus ustulatus
## [25] Catharus fuscescens Vireo griseus
## [27] Cardellina pusilla Hylocichla mustelina
## [29] Icteria virens Setophaga petechia
## [31] Melozone aberti Vermivora cyanoptera
## [33] Passer domesticus Aimophila ruficeps
## [35] Poecile carolinensis Troglodytes aedon
## [37] Dumetella carolinensis Mniotilta varia
## [39] Lanius ludovicianus Anthus spragueii
## [41] Euphagus carolinus Empidonax minimus
## [43] Aythya affinis Oreothlypis peregrina
## [45] Cyanistes caeruleus Phasianus colchicus
## [47] Lagopus lagopus Tetrao tetrix
## [49] Dryocopus maritus Serin serin
## [51] Vanellus vanellus Corvus corone
## [53] Turdus merula Corvus monedula
## [55] Columba palumbus Turtle Dove
## [57] Tetrastes bonasia Perdix perdix
## [59] Anas platyrhyncos Branta canadensis
## [61] Columba livia Numenius arguata
## [63] Turdus pilaris Turdus iliacus
## [65] Turdus philomelos Fringilla coelebs
## [67] Buteo lagopus Accipiter striatus
## [69] Falco sparverius Accipiter gentillis
## [71] Accipiter cooperii Buteo jamaicensis
## [73] Buteo platypterus Buteo swainsoni
## [75] Circus cyaneus Falco columbarius
## [77] Falco mexicanus Falco perigrinus
## [79] Wilsonia citrina Oporornis tolmiei
## [81] Wilsonia pusilla Homo sapiens
## [83] Charadrius montanus
## 83 Levels: Accipiter cooperii Accipiter gentillis ... Wilsonia pusilla
Load H isotope data for North American Loggerhead Shrike from the package database. Here we are limiting the data to values from one publication…comparability of H isotope measurements across different labs and methods is often questionable.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## 524 data points are found
For a real application you would want to explore the knownOrig.rda dataset to find measurements that are appropriate to your study system (same or similar taxon, geographic region, measurement approach, etc.) or collect and import known-origin data that are specific to your system.
We need to start by assessing how the environmental (precipitation) isoscape values correlate with the sample values. calRaster fits a linear model relating the precipitation isoscape values to sample values, and applies it to produce a sample-type specific isoscape.
## Warning in spTransform(xSP, CRSobj, ...): NULL source CRS comment, falling back
## to PROJ string
## Warning in spTransform(xSP, CRSobj, ...): NULL target CRS comment, falling back
## to PROJ string
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in calRaster(known = d, isoscape = d2h_lrNA, mask = naMap): known was
## reprojected
## Warning in spTransform(xSP, CRSobj, ...): NULL source CRS comment, falling back
## to PROJ string
## Warning in spTransform(xSP, CRSobj, ...): NULL target CRS comment, falling back
## to PROJ string
## Warning in calRaster(known = d, isoscape = d2h_lrNA, mask = naMap): mask was
## reprojected
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in proj4string(x): CRS object has comment, which is lost in output
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
##
##
## ---------------------------------------
## ------------------------------------------
## rescale function uses linear regression model,
## the summary of this model is:
## -------------------------------------------
## --------------------------------------
##
## Call:
## lm(formula = tissue.iso ~ isoscape.iso[, 1])
##
## Residuals:
## Min 1Q Median 3Q Max
## -83.321 -9.287 0.198 10.774 60.327
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.92452 1.77787 0.52 0.603
## isoscape.iso[, 1] 1.11204 0.03966 28.04 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.14 on 522 degrees of freedom
## Multiple R-squared: 0.601, Adjusted R-squared: 0.6002
## F-statistic: 786.2 on 1 and 522 DF, p-value: < 2.2e-16
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
Let’s create some hypothetical sample IDs and values to demonstrate how samples of unknown origin can be assigned to the calibrated isoscape. The isotope values are drawn from a random distribution with a standard deviation of 8 per mil, which is a pretty reasonable variance for conspecific residents at a single location. If you had real measured data for your study samples you would load them here, instead.
## id d2H
## 1 a -114.48381
## 2 b -111.84142
## 3 c -97.53033
## 4 d -109.43593
## 5 e -108.96570
Produce posterior probability density maps used to the assign the unknown origin samples. For reference on the Bayesian inversion method see Wunder, 2010
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
Cell values in these maps are small because each cell’s value represents the probability that this one cell, out of all of them on the map, is the actual origin of the sample. Together, all cell values on the map should sum to ‘1’, reflecting the assumption that the sample originated somewhere in the study area. Let’s check this for sample ‘a’.
## [1] 1
Check out the help page for pdRaster for additional options, including the use of informative prior probabilities.
The oddsRatio tool compares the posterior probabilities for two different locations or regions. This might be useful in answering real-world questions…for example “is this sample more likely from France or Spain?”, or “how likely is this hypothesized location relative to other possibilities?”.
Let’s compare probabilities for two spatial areas - the states of Utah and New Mexico. First we’ll load the SpatialPolygons and plot them.
data("states")
s1 = states[states$STATE_ABBR == "UT",]
s2 = states[states$STATE_ABBR == "NM",]
plot(naMap)
lines(s1, col = c("red"))
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
Get the odds ratio for the two regions. The result reports the odds ratio for the regions (first relative to second) for each of the 5 unknown samples plus the ratio of the areas of the regions. If the isotope values (& prior) were completely uninformative the odds ratios would equal the ratio of areas.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in proj4string(inputP): CRS object has comment, which is lost in output
## Warning in proj4string(inputP): CRS object has comment, which is lost in output
## Warning in proj4string(x): CRS object has comment, which is lost in output
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in proj4string(x): CRS object has comment, which is lost in output
## Warning in proj4string(x): CRS object has comment, which is lost in output
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in proj4string(x): CRS object has comment, which is lost in output
## Warning in proj4string(x): CRS object has comment, which is lost in output
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## $`P1/P2 odds ratio`
## a b c d e
## 186.90811 137.70244 28.00606 104.55897 99.11234
##
## $`Ratio of numbers of cells in two polygons`
## [1] 2
Here you can see that even though Utah is quite a bit smaller the isotopic evidence suggests it’s much more likely to be the origin of each sample. This result is consistent with what you might infer from a first-order comparison of the state map with the posterior probability maps, above.
Comparisons can also be made using points. Let’s create two points (one in each of the Plover regions) and compare their odds. This result also shows the odds ratio for each point relative to the most- and least-likely grid cells on the posterior probability map.
pp1 = c(-112,40)
pp2 = c(-105,33)
pp12 = SpatialPoints(coords = rbind(pp1,pp2))
proj4string(pp12) = proj4string(naMap)
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in proj4string(inputP): CRS object has comment, which is lost in output
## Warning in proj4string(inputP): CRS object has comment, which is lost in output
## Warning in proj4string(x): CRS object has comment, which is lost in output
## $`P1/P2 odds ratio`
## a b c d e
## 712.93331 465.11041 45.99972 315.28453 292.20970
##
## $`Odds relative to the max/min pixel`
## ratioToMax.a ratioToMax.b ratioToMax.c ratioToMax.d ratioToMax.e
## P1 0.7189405387 0.72920526 1.06869138 0.864856467 0.8676131
## P2 0.0009917117 0.00168752 0.02429042 0.002549864 0.0029879
## ratioToMin.a ratioToMin.b ratioToMin.c ratioToMin.d ratioToMin.e
## P1 64948687.05 300012.94 15346648 32221284.945 14559753.86
## P2 36947.43 28318.93 2194383 1049.077 42906.92
The odds of the first point being the location of origin are pretty high for each sample, and much higher than for the second point.
Researchers often want to classify their study area in to regions that are and are not likely to be the origin of the sample (effectively ‘assigning’ the sample to a part of the area). This requires choosing a subjective threshold to define how much of the study domain is represented in the assignment region. qtlRaster offers two choices.
Extract 10% of the study area, giving maps that show the 10% of grid cells with the highest posterior probability for each sample.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## class : RasterStack
## dimensions : 17, 38, 646, 5 (nrow, ncol, ncell, nlayers)
## resolution : 2.999999, 2.999999 (x, y)
## extent : -165, -51.00005, 20.58329, 71.58327 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs
## names : a, b, c, d, e
## min values : 0, 0, 0, 0, 0
## max values : 1, 1, 1, 1, 1
Extract 80% of the posterior probability density, giving maps that show the smallest region within which there is an 80% chance each sample originated.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## class : RasterStack
## dimensions : 17, 38, 646, 5 (nrow, ncol, ncell, nlayers)
## resolution : 2.999999, 2.999999 (x, y)
## extent : -165, -51.00005, 20.58329, 71.58327 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs
## names : a, b, c, d, e
## min values : 0, 0, 0, 0, 0
## max values : 1, 1, 1, 1, 1
Comparing the two results, the probability-based assignment regions are broader. This suggests that we’ll need to assign to more than 10% of the study area if we want to correctly assign 80% or more of our samples. We’ll revisit this below and see how we can chose thresholds that are as specific as possible while achieving a desired level of assignment ‘quality’.
Most studies involve assigning multiple individuals, and often it is desirable to summarize the results from these individuals. jointP and unionP offer two options for summarizing posterior probabilities from multiple samples.
Calculate the probability that all samples came from any given grid cell in the analysis area. Note that this summarization will only be useful if all samples are truly derived from a single population of common geographic origin.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## class : RasterLayer
## dimensions : 17, 38, 646 (nrow, ncol, ncell)
## resolution : 2.999999, 2.999999 (x, y)
## extent : -165, -51.00005, 20.58329, 71.58327 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs
## source : memory
## names : Joint_Probability
## values : 7.806487e-38, 0.0206668 (min, max)
Calculate the probability that any sample came from any given grid cell in the analysis area. In this case we’ll save the output to a variable for later use.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
The results from unionP highlight a broader region, as you might expect.
Any of the other post-hoc analysis tools can be applied to the summarized results. Here we’ll use qtlRaster to identify the 10% of the study area that is most likely to be the origin of one or more samples.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## class : RasterLayer
## dimensions : 17, 38, 646 (nrow, ncol, ncell)
## resolution : 2.999999, 2.999999 (x, y)
## extent : -165, -51.00005, 20.58329, 71.58327 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs
## source : memory
## names : layer
## values : 0, 1 (min, max)
How good are the geographic assignments? What area or probability threshold should be used? Is it better to use isoscape A or B for my analysis? These questions can be answered through split-sample validation using QA.
We will run quality assessment on the known-origin dataset and precipitation isoscape. These analyses take some time to run, depending on the number of stations and iterations used (this one took about two minutes on my desktop PC).
##
|
| | 0%
|
|================== | 25%
|
|=================================== | 50%
|
|==================================================== | 75%
|
|======================================================================| 100%
Plot the result. (Please note that because of changes in R’s random number generator your results may not exactly match those shown here if you are using R version 3.5.X.)
The first three panels show three metrics, granularity (higher is better), bias (closer to 1:1 is better), and sensitivity (higher is better). The second plot shows the posterior probabilities at the known locations of origin relative to random (=1, higher is better). More information is provided in Ma et al., in review.
A researcher might refer to the sensitivity plot, for example, to assess what qtlRaster area threshold would be required to obtain 90% correct assignments in their study system. Here it’s somewhere between 0.25 and 0.3.
How would using a different isoscape or different known origin dataset affect the analysis? Multiple QA objects can be compared to make these types of assessments.
Let’s modify our isoscape to add some random noise.
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum WGS_1984 in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded datum Unknown based on WGS84 ellipsoid in CRS definition,
## but +towgs84= values preserved
We’ll combine the fuzzy isoscape with the uncertainty layer from the original isoscape, then rerun QA using the new version. Obviously this is not something you’d do in real work, but as an example it allows us to ask the question “how would the quality of my assignments change if my isoscape predictions were of reduced quality?”.
d2h_fuzzy = brick(d2h_fuzzy, d2h_lrNA[[2]])
qa2 = QA(d2h_fuzzy, d, valiStation = 8, valiTime = 4, by = 5, mask = naMap, name = "fuzzy")
##
|
| | 0%
|
|================== | 25%
|
|=================================== | 50%
|
|==================================================== | 75%
|
|======================================================================| 100%
Now plot to compare.
## NULL
Assignments made using the fuzzy isoscape are generally poorer than those made without fuzzing. Hopefully that’s not a surprise, but you might encounter cases where decisions about how to design your project or conduct your data analysis do have previously unknown or unexpected consequences. These types of comparisons can help reveal them!
Questions or comments? gabe.bowen@utah.edu