graph4lg
The rationale of graph4lg
package in R is to make easier the construction and analysis of genetic and landscape graphs in landscape genetic studies (hence the name graph4lg
, meaning Graphs for Landscape Genetics). This package provides users with tools for:
Each one of the included tutorials focuses on one of these points. This third tutorial will focus on landscape graph construction and analysis. It will describe the package functions allowing users to:
NOTA BENE: The package graph4lg
integrates functions making possible the construction and analysis of landscape graphs but most of them are only wrappers that launch computations with Graphab software.
Graphab is an open source software tool dedicated to the modelling of landscape networks (Foltête, Clauzel, and Vuidel 2012). It was developed in ThéMA laboratory (Besançon, France) to make possible the creation of landscape graphs and to integrate a complete set of connectivity analysis functions in a single application.
Integrating Graphab functionalities into graph4lg
aims at facilitating the import of Graphab output into R for subsequent analyses, including the comparison of genetic graphs and landscape graphs. All the functions calling Graphab software are described in graph4lg
help files. However, users are invited to read the complete documentation manuals as well as other resources on Graphab project website. Some computations are not possible when using graph4lg
but are possible when using directly Graphab software which can be downloaded for free on the same website. Foltête, Clauzel, and Vuidel (2012) paper also provides users with a synthetic description of the tool.
The function get_graphab
from graph4lg
package downloads automatically Graphab in the user’s machine, provided Java is installed on the machine. If it has already been downloaded, a message is displayed. Once Graphab has been downloaded in a machine, it is not necessary to download it again.
In the table below, we present all the functions from grapha4lg
calling Graphab software:
Function | Description |
---|---|
get_graphab |
Check if Graphab software (graphab-2.4.jar ) is in the user’s machine and downloads it if necessary, provided Java is installed. |
graphab_project |
Creates a Graphab project from a raster file (.tif) |
graphab_link |
Creates a set of links (Euclidean or least cost paths, planar or complete) |
graphab_graph |
Creates a graph from a set of links |
graphab_metric |
Computes graph-theoretic metrics from a graph |
graphab_modul |
Partition the graph into modules |
graphab_pointset |
Imports a set of points in the project and relates them to habitat patches and corresponding metrics |
get_graphab_metric |
Imports a table with the metrics computed and the patch properties |
get_graphab_linkset |
Imports a table with the link properties for a given link set |
graphab_to_igraph |
Imports a graph created with Graphab as an igraph graph object with node and link attributes |
Here, we do not rely on genetic data sets and will only use Graphab projects already created for the vignettes.
When using Graphab, input data consists of an 8-bit raster file (.tif) with discrete cell values. One (or several) of these values will correspond to habitat cells on the raster. Contiguous habitat cells will form habitat patches (8-neighbourhood criterion), as soon as they exceed the minimum patch area when specified
This is the first step of habitat network modelling with Graphab. At this stage, a project is created. This project is given a name (proj_name
), which will be the name of a directory containing geographical information layers relative to the project as well as a .xml file (“proj_name.xml”), which is the main project file.
The function graphab_project
creates the project directory called proj_name
either in the current working directory (default) or in the directory whose path is given as a proj_path
argument. The function takes as arguments:
proj_name
: Name of the project (project and .xml file directory)raster
: Path to the raster file on which habitat patches are identifiedhabitat
: Cell value(s) corresponding to habitat patches (possibly a vector with several values (e.g. c(1,2
))minarea
(optional): Minimum habitat patch size in hectaresnodata
(optional): Cell value corresponding to no data values, thereafter ignoredalloc_ram
(optional): Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with user’s machine settingsproj_path
(optional): Path to the directory that contains the project directory. It should be used when the project directory is not in the current working directory. Default is NULL
. When 'proj_path = NULL'
, the project directory is equal to getwd()
.The two latter arguments are included in almost every function calling Graphab.
As an example, we will create a landscape graph from a simulated landscape raster with 5 land use categories (see map below).
load(file = paste0(system.file('extdata', package = 'graph4lg'),
"/", "rast_simul50.RDa"))
r.spdf <- as(rast, "SpatialPixelsDataFrame")
r.df <- as.data.frame(r.spdf)
r.df$layer <- as.factor(r.df$rast_simul50)
g <- ggplot(r.df, aes(x=x, y=y)) + geom_tile(aes(fill = layer)) + coord_equal()+
theme_bw()+
#scale_fill_brewer(palette="Dark2")+
scale_fill_manual(values = c("#396D35", "#FB9013", "#EDC951", "#80C342", "black", "#396D35"),
labels = c("0 - Forest", "1 - Shrublands", "2 - Crops",
"3 - Grasslands","4 - Artificial areas", "5 - Forest"),
name = "Land use type")+
labs(x="Longitude",y="Latitude")
g
Habitat patches will correspond to contiguous patches of raster cells equal to 0 or 5, as soon as they are larger than 200 hectares. They correspond to forests.
The project has been created and we will now create a link set between the habitat patches. We will consider least cost paths between habitat patches on the raster. To that purpose, we need to create a data.frame
specifying the cost values associated with every raster cell value. In the following example, these values are:
Code | Description | Cost value |
---|---|---|
0 | Forests (habitat) | 1 |
1 | Shrublands | 5 |
2 | Crops | 60 |
3 | Grasslands | 40 |
4 | Artificial areas | 1000 |
5 | Forest (habitat) | 1 |
They are stored into the cost
object (data.frame
):
The link set is created with graphab_link
function with the following arguments:
proj_name
: Name of the projectdistance
: Type of distance computed. By default, distance="cost"
, meaning that cost distances are computed between patches according the cost values specified in cost
data.frame. Alternatively, distance="euclid"
, meaning that straight line Euclidean distances are computed.name
: A character string indicating the name of link setcost
: A data.frame
with the cost values (cost
in our example)topo
: A character string indicating the topology of the created link set. It can be planar (topo='planar'
(default)). It speeds up the computation but will prevent from creating complete graphs with graphab_graph
, or it can be complete (topo='complete'
).See more details in the help file (?graphab_link
).
In our example, we create a planar link set in cost-distances:
graphab_link(proj_name = proj_name,
distance = "cost",
cost = cost,
name = "lkst1",
topo = "planar")
NOTA BENE: Several link sets can be created in the same project.
We can now create a landscape graph with graphab_graph
. This function takes as arguments:
proj_name
: Name of the projectlinkset
: Name of the link set used to create the graphname
: A character string indicating the name of the graph created, if only one is createdthr
: An integer or numeric value indicating the maximum distance associated with the links of the created graph. It allows users to create a pruned graph based on a distance threshold. Note that when the link set used has a planar topology, the graph is necessarily a pruned graph (not complete) and adding this threshold parameter can remove other links. When the link set has been created with cost-distances, the parameter is expressed in cost-distance units whereas when the link set is based upon Euclidean distances, the parameter is expressed in meters.cost_conv
: Logical (TRUE
or FALSE
) indicating whether numeric thr
values are converted from cost-distance into Euclidean distance using a log-log linear regression. See also convert_cd
function presented in the first tutorial.proj_name
is the only mandatory argument. Without any other argument, a non-thresholded graph is created for every link set present in the project. Names are created automatically.
In our example, we can create a planar graph with the following command:
Once the graph has been created, it can be analysed by computing graph-theoretic connectivity metrics or by partitioning its nodes.
Many connectivity metrics can be computed from a landscape graph (see Baranyi et al. (2011) and Rayfield, Fortin, and Fall (2011) among other reviews on the subject). Graphab software includes a large range of connectivity metrics and its manual provides users with a comprehensive description of every one of them (see Graphab 2.4 manual).
The function graphab_metric
computes these metrics. It takes as arguments:
proj_name
: Name of the project
graph
: Name of the graph on which the metric is computed
metric
: Name of the metric among:
"PC"
),"IIC"
),"F"
),"IF"
),"Dg"
),"CCe"
),"CF"
)"dPC"
).dist
: A numeric or integer value specifying the distance at which dispersal probability is equal to prob
. This argument is mandatory for weighted metrics (PC, F, IF, BC, dPC, CCe, CF) but not used for others. It is used to set \(\alpha\) for computing dispersal probabilities associated with all inter-patch distances such that dispersal probability between patches \(i\) and \(j\) is \(p_{ij}= e^{-\alpha d_{ij}}\).
prob
: A numeric or integer value specifying the dispersal probability at distance dist
. By default, code=0.05
. It is used to set \(\alpha\) (see argument dist
above).
beta
: A numeric or integer value between 0 and 1 specifying the exponent associated with patch areas in the computation of metrics weighted by patch area. By default, beta=1
. When beta=0
, patch areas do not have any influence in the computation.
cost_conv
: Logical (TRUE
or FALSE
) indicating whether numeric dist
values are converted from cost-distance into Euclidean distance using a log-log linear regression.
return_val
: Logical (default = TRUE
) indicating whether metric values are returned in R (TRUE
) or only stored in the patch attribute layer (FALSE
)
Metrics fall into different categories. PC and IIC are global metrics and take only one value for the entire graph, which is returned in R environment when return_val=TRUE
.
The other metrics are computed at the node level. When return_val=TRUE
, a data.frame
is returned in R environment specifying the value of the metric for every graph node.
The description of every metric is beyond the scope of this tutorial. We again invite users to read the help file (?graphab_metric
) as well as the Graphab 2.4 user manual.
We will compute metrics on the graph "graph"
in our example. First, we will compute the probability of connectivity at the global level. We set the dist
and prob
parameter such that: \(p(10km)=e^{-\alpha \times d_{ij}}=0.05\). We convert 10 km in cost-distance units for the computation.
# Global metric: PC
graphab_metric(proj_name = proj_name,
graph = "graph",
metric = "PC",
dist = 10000,
prob = 0.05,
beta = 1,
cost_conv = TRUE)
#> [1] "Project : graphab_example" "Graph : graph"
#> [3] "Metric : PC" "Dist : 10000"
#> [5] "Prob : 0.05" "Beta : 1"
#> [7] "Value : 0.000903573300776615"
We obtain the value in an object of class list
.
Now, using the same parameters, we compute the local metric Flux.
f <- graphab_metric(proj_name = proj_name,
graph = "graph",
metric = "F",
dist = 10000,
prob = 0.05,
beta = 1,
cost_conv = FALSE)
#> [[1]]
#> [1] "Project : graphab_example" "Graph : graph"
#> [3] "Metric : F" "Dist : 10000"
#> [5] "Prob : 0.05" "Beta : 1"
Every time a local metric is computed, it can be returned in R environment (here stored in f
) but is also stored in the habitat patch attribute table, such that at the end this table contains every metric computed. We will see later that it can be very useful.
Another way to analyse landscape graph connectivity and topology is to make a partitioning. The principle is the same as that of modularity analyses of genetic graphs described in the second tutorial.
graphab_modul
function carries out such an analysis, but before we perform some modifications in the current code, we recommend doing this analysis directly in Graphab software.
For the moment, it only creates a shapefile polygon layer with the Voronoi polygons corresponding to the groups of patches forming modules.
It takes as arguments:
proj_name
: Name of the projectgraph
: Name of the graph on which the partition is performeddist
: A numeric or integer value specifying the distance at which dispersal probability is equal to prob
. This argument is used to weight the links when modularity is computed.prob
: A numeric or integer value specifying the dispersal probability at distance dist
. By default, code=0.05
. It is used to set \(\alpha\) (see argument dist
above).beta
: A numeric or integer value between 0 and 1 specifying the exponent associated with patch areas in the computation of the modularity. By default, beta=1
. When beta=0
, patch areas do not have any influence in the computation.nb
: Optional argument indicating the number of modules to create. By default, the number of modules maximising the modularity index is used.For example:
NOTA BENE: All these analyses can be performed for several graphs created in the same project, provided their name is correctly indicated in the function arguments.
Landscape graphs have been created and analysed by computing connectivity metrics and/or by partitioning them. The output of these analyses can be used in landscape genetic analyses, e.g. to assess the relationship between genetic variables measured in populations inhabiting the study landscape and the connectivity of its habitat patches. To do that, two approaches are here presented:
Imagine that we have sampled several populations in the forests of the study landscape. We want to know in which landscape graph node lives every population, to get the corresponding connectivity metrics and compare them with the genetic response.
To do that, the function graphab_pointset
imports a set of points into the Graphab project. It takes as arguments:
proj_name
: Name of the project
linkset
: Name of the link set used to compute the distance from each point to the nearest patch (in cost-distance or Euclidean distance units depending on the way the link set was created)
pointset
: it can be either:
data.frame
with three columns: ID, x and y, respectively indicating the point ID, longitude and latitude.SpatialPointsDataFrame
Point coordinates must be in the same coordinate reference system as the habitat patches (and initial raster layer). The function returns a data.frame
with the properties of the nearest patch to every point in the point set, as well as the distance from each point to the nearest patch.
For example, we can add the point data.frame
pts_pop_simul
to the project with the following command.
Note that when pointset
is not the path to a shapefile point layer, such a layer will be created in the temp files of the user’s machine. It is therefore preferable to give the path to a shapefile layer as a pointset
argument.
Users can also import in R environment the link set properties in a data.frame
with an edge list format with the function get_graphab_linkset
which takes as arguments the project name and the link set name.
Similarly, users can import the table in which are stored all the metrics computed in the project as well as the node properties (including their areas). The function get_graphab_metric
takes as argument the project name.
Finally, users can also create a graph from a Graphab link set and convert it into a graph object of class igraph
. The imported graph has weighted links. Nodes attributes present in the Graphab project are included, including connectivity metrics when computed. A figure can be displayed automatically representing the graph on a map. The graph is given the topology of the selected link set.
The function graphab_to_igraph
takes as arguments:
proj_name
: Name of the projectlinkset
: Name of the link set used to compute the distance from each point to the nearest patch (in cost-distance or Euclidean distance units depending on the way the link set was created)nodes
: A character string indicating whether the nodes of the created graph are given all the attributes (nodes="patches"
, default) or metrics computed in Graphab or only those specific to a given graph previously created (nodes="graph name"
).weight
: A character string (weight="euclid"
or weight="cost"
) indicating whether to weight the links with Euclidean distance or cost-distance (default) values.fig
: Logical (TRUE
or FALSE
) indicating whether to plot a figure of the resulting spatial graph with plot_graph_lg
.crds
: Logical (TRUE
or FALSE
) indicating whether to create an object of class data.frame
with the node centroid spatial coordinates.land_graph <- graphab_to_igraph(proj_name = proj_name,
linkset = "lkst1",
nodes = "patches",
weight = "cost",
fig = TRUE,
crds = TRUE)
crds_patches <- land_graph[[2]]
land_graph <- land_graph[[1]]
The function returns a list of two objects:
land_graph[[1]]
)land_graph[[2]]
)This graph can be plotted on a map with plot_graph_lg
with node sizes proportional to habitat patch area and link width inversely proportional to cost-distances:
We have seen in this third tutorial how to construct, analyse and import landscape graphs with Graphab and graph4lg
. In the next and last tutorial, we will see how to compare genetic graphs and landscape graphs.
Baranyi, Gabriella, Santiago Saura, János Podani, and Ferenc Jordán. 2011. “Contribution of Habitat Patches to Network Connectivity: Redundancy and Uniqueness of Topological Indices.” Ecological Indicators 11 (5): 1301–10.
Foltête, Jean-Christophe, Céline Clauzel, and Gilles Vuidel. 2012. “A Software Tool Dedicated to the Modelling of Landscape Networks.” Environmental Modelling & Software 38: 316–27.
Rayfield, Bronwyn, Marie-Josée Fortin, and Andrew Fall. 2011. “Connectivity for Conservation: A Framework to Classify Network Measures.” Ecology 92 (4): 847–58.