One essential role of EML metadata is in precisely defining the units in which data is measured. To make sure these units can be understood by (and thus potentially converted to other units by) a computer, it’s necessary to be rather precise about our choice of units. EML knows about a lot of commonly used units, referred to as “standardUnits,” already. If a unit is in EML’s standardUnit
dictionary, we can refer to it without further explanation as long as we’re careful to use the precise id
for that unit, as we will see below.
Sometimes data involves a unit that is not in standardUnit
dictionary. In this case, the metadata must provide additional information about the unit, including how to convert the unit into the SI system. EML uses an existing standard, stmml, to represent this information, which must be given in the additionalMetadata
section of an EML file. The stmml
standard is also used to specify EML’s own standardUnit
definitions.
custom_units <-
data.frame(id = "speciesPerSquareMeter",
unitType = "arealDensity",
parentSI = "numberPerSquareMeter",
multiplierToSI = 1,
description = "number of species per square meter")
unitList <- set_unitList(custom_units)
Start with a minimal EML document
me <- list(individualName = list(givenName = "Carl", surName = "Boettiger"))
my_eml <- list(dataset = list(
title = "A Minimal Valid EML Dataset",
creator = me,
contact = me),
additionalMetadata = list(metadata = list(
unitList = unitList
))
)
## NULL
## [1] TRUE
## attr(,"errors")
## character(0)
Note: Custom units are widely misunderstood and misused in EML. See examples from custom-units
Let us start by examining the numeric attributes in an example EML file. First we read in the file:
f <- system.file("tests", emld::eml_version(), "eml-datasetWithUnits.xml", package = "emld")
eml <- read_eml(f)
We extract the attributeList
, and examine the numeric attributes (e.g. those which have units):