Thematic choropleth maps are used to display quantities of some variable within areas, such as mapping median income across a city’s neighborhoods. However, we often think in bivariate terms: “how do race and income vary together?” Maps that capture this, known as bivariate choropleth maps, are often perceived as difficult to create and interpret. The goal of biscale is to implement a consistent approach to bivariate mapping entirely within R. The package’s workflow is based on a recent tutorial written by Timo Grossenbacher and Angelo Zehr, and supports both two-by-two and three-by-three bivariate maps.
Since the package does not directly use functions from sf, it is a suggested dependency rather than a required one. However, the most direct approach to using biscale is with sf objects, and we therefore recommend users install sf before proceeding with using biscale. Windows users should be able to install sf without significant issues, but macOS and Linux users will need to install several open source spatial libraries to get sf itself up and running. The easiest approach for macOS users is to install the GDAL 2.0 Complete framework from KyngChaos.
For Linux users, steps will vary based on the flavor being used. Our configuration file for Travis CI and its associated bash script should be useful in determining the necessary components to install.
Once sf is installed, the easiest way to get biscale is to install it from CRAN:
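A minimal sketch of the CRAN route, assuming the package is published under the name biscale and that sf is installed first, as recommended above:

```r
# install sf first (recommended above), then biscale from CRAN
install.packages("sf")
install.packages("biscale")
```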
Alternatively, the development version of biscale can be accessed from GitHub with remotes:
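A sketch of the GitHub route follows. The repository path used here is an assumption, not something stated in this document, so confirm it on the package's GitHub page before running:

```r
# install remotes if it is not already available
install.packages("remotes")

# ASSUMPTION: "chris-prener/biscale" is the repository path; verify before running
remotes::install_github("chris-prener/biscale")
```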
All functions within biscale use the prefix bi_ to leverage the auto-completion features of RStudio and other IDEs.
biscale contains a data set of U.S. Census tracts for the City of St. Louis in Missouri. Median income and the percentage of white residents are both included, and can be used to demonstrate the package’s functionality.
Once data are loaded, bivariate classes can be applied with the bi_class() function:
# load dependencies
library(biscale)

# create classes
data <- bi_class(stl_race_income, x = pctWhite, y = medInc, style = "quantile", dim = 3)
Note that, as of v0.2 of the biscale package, sf is imported when you load biscale. This resolves issues related to not loading sf ahead of time, though it does add a dependency that a small number of users may have wished not to install.
The dim argument is used to control the extent of the legend: do you want to produce a two-by-two map (dim = 2) or a three-by-three map (dim = 3)?
Classes can be applied with the style parameter using four approaches for calculating breaks: "quantile" (default), "equal", "fisher", and "jenks". The default "quantile" approach will create relatively equal “buckets” of data for mapping, with a break created at the median (50th percentile) for a two-by-two map or at the 33rd and 66th percentiles for a three-by-three map.
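The quantile idea described above can be sketched in base R. This is only an illustration: quantile_class() is a hypothetical helper that is not part of biscale, the values are made up, and bi_class() may compute its breaks differently internally.

```r
# Hypothetical helper, not part of biscale: assign each value to one of
# `dim` quantile-based classes (1 = lowest, dim = highest).
quantile_class <- function(x, dim = 3) {
  probs  <- seq(0, 1, length.out = dim + 1)
  breaks <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

# made-up values for nine areas (not the St. Louis data)
pct_white  <- c(5, 12, 18, 25, 33, 41, 58, 72, 90)
med_income <- c(21, 48, 30, 27, 55, 62, 35, 80, 95)

x_class <- quantile_class(pct_white, dim = 3)
y_class <- quantile_class(med_income, dim = 3)

# combine the two classes into one bivariate label such as "2-3"
# (middle class on x, top class on y)
bi_label <- paste0(x_class, "-", y_class)

# each x class holds a roughly equal share of the areas
table(x_class)
```

With dim = 2 the same helper splits each variable at its median, which corresponds to the two-by-two case.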
With the sample data, this creates a very broad range for the percent white measure in particular. Using one of the other approaches to calculating breaks yields a narrower range for the breaks and produces a map that does not overstate the percent of white residents living on the north side of St. Louis.
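To see why the choice of breaks matters for skewed data, here is a small base R comparison of quantile and equal-interval breaks. The numbers are made up, and the equal-interval calculation is an approximation of what style = "equal" is assumed to do in bi_class().

```r
# made-up, strongly skewed percentages (not the St. Louis data)
pct <- c(1, 2, 2, 3, 4, 5, 40, 75, 90)

# inner breaks for a three-by-three map under each approach
quantile_breaks <- quantile(pct, probs = c(1/3, 2/3), names = FALSE)
equal_breaks    <- seq(min(pct), max(pct), length.out = 4)[2:3]

# width of the top class under each approach
top_quantile <- max(pct) - quantile_breaks[2]
top_equal    <- max(pct) - equal_breaks[2]

print(round(quantile_breaks, 1))
print(round(equal_breaks, 1))
```

In this toy example the top quantile class spans a much wider range of values than the top equal-interval class, which is the kind of overstatement the paragraph above describes.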
Once breaks are created, we can use bi_scale_fill() as part of our ggplot() call:
# load ggplot2, which provides ggplot(), geom_sf(), aes(), and labs()
library(ggplot2)

# create map
map <- ggplot() +
  geom_sf(data = data, mapping = aes(fill = bi_class), color = "white", size = 0.1, show.legend = FALSE) +
  bi_scale_fill(pal = "DkBlue", dim = 3) +
  labs(
    title = "Race and Income in St. Louis, MO",
    subtitle = "Dark Blue (DkBlue) Palette"
  ) +
  bi_theme()
This requires that the variable bi_class, created with bi_class(), is used as the fill variable in the aesthetic mapping. We also want to remove the default legend from the plot, since it will not accurately communicate the complexity of the bivariate scale.
The dimensions of the scale must again be supplied for bi_scale_fill() (they should match the dimensions given for bi_class()!), and a palette must be given. Options for palettes are "Brown", "DkBlue", "DkCyan", "DkViolet", or "GrPink". The "DkViolet" palette was created by Timo Grossenbacher and Angelo Zehr, and the other four palettes were created by Joshua Stevens. The first map in this vignette uses the "GrPink" palette. Other two-by-two palettes look like so:
Here are several other three-by-three palettes:
The "Brown" palette is not pictured, but can be previewed (along with the other palettes) using the bi_pal() function. The same function can also be used to return vectors of color hex values. Samples of each palette are available on the package website.
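A short sketch of both uses of bi_pal() follows. The pal and dim arguments mirror those shown for bi_scale_fill(); the preview = FALSE argument for returning hex values is an assumption about the function's interface, so check the function's help page if it does not behave as shown.

```r
library(biscale)

# preview the "Brown" palette as a plot
bi_pal(pal = "Brown", dim = 3)

# ASSUMPTION: preview = FALSE returns the hex values instead of drawing a plot
brown_hex <- bi_pal(pal = "Brown", dim = 3, preview = FALSE)
brown_hex
```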
The example above also includes bi_theme(), which is based on the theme designed by Timo Grossenbacher and Angelo Zehr. This theme creates a simple, clean canvas for bivariate mapping that removes potentially distracting elements.
Note that it is also possible to apply custom color palettes using the bi_pal_manual() function.
We’ve set show.legend = FALSE so that we can manually add our own bivariate legend. The legend itself can be created with the bi_legend() function:
legend <- bi_legend(pal = "DkBlue",
                    dim = 3,
                    xlab = "Higher % White ",
                    ylab = "Higher Income ",
                    size = 8)
The palette and dimensions should match what has been used for both bi_class() (in terms of dimensions) and bi_scale_fill() (in terms of both dimensions and palette). The size argument controls the font size used on the legend. Note that plotmath is used to draw the arrows since Unicode arrows are font dependent. This happens internally as part of bi_legend(), so you don’t need to include them in your xlab and ylab arguments!
With our legend drawn, we can then combine the legend and the map with a package like cowplot. The position and size values used at this stage will need some experimentation, since they depend on the shape of the map itself.
# load cowplot, which provides ggdraw() and draw_plot()
library(cowplot)

# combine map with legend
finalPlot <- ggdraw() +
  draw_plot(map, 0, 0, 1, 1) +
  draw_plot(legend, 0.2, 0.65, 0.2, 0.2)
This approach allows us to customize the legend’s placement and size to suit different map layouts. All of the maps shown as part of this vignette were produced using this approach. There are other approaches you could take as well that do not use cowplot. To maintain backward compatibility, cowplot is no longer a suggested dependency of biscale.
If you are new to R itself, welcome! Hadley Wickham’s R for Data Science is an excellent way to get started with data manipulation in the tidyverse, which biscale is designed to integrate seamlessly with.
If you are new to spatial analysis in R, we strongly encourage you to check out the excellent new Geocomputation with R by Robin Lovelace, Jakub Nowosad, and Jannes Muenchow.
If you have questions about using biscale, you are encouraged to use the RStudio Community forums. Please create a reprex before posting. Feel free to tag Chris (@chris.prener) in any posts about biscale.
If you think you’ve found a bug, please create a reprex and then open an issue on GitHub.
If you have features or suggestions you want to see implemented, please open an issue on GitHub (and ideally create a reprex to go with it!). Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.