This R package provides miscellaneous tools for Finnish open government data. Your contributions, bug reports and other feedback are welcome!
Installation (Asennus)
Finnish provinces (Maakuntatason informaatio)
Finnish municipalities (Kuntatason informaatio)
Finnish personal identification number (HETU) (Henkilotunnuksen kasittely)
Visualization tools (Visualisointirutiineja)
See also other rOpenGov packages, in particular:
We assume you have installed R. If you use RStudio, change the default encoding to UTF-8. Linux users should also install CURL.
Install the stable release version in R:
install.packages("sorvi")
Test the installation by loading the library:
library(sorvi)
Install the development version (for developers):
library(devtools)
install_github("ropengov/sorvi")
We also recommend setting the UTF-8 encoding:
Sys.setlocale(locale="UTF-8")
Brief examples of the package tools are provided below. Further examples are available in the Louhos blog and in our Rmarkdown blog.
Finnish municipality information is available through Statistics Finland (Tilastokeskus; see the statfi package) and the National Land Survey of Finland (Maanmittauslaitos). The row names of each data set are harmonized, so they can be used to match data sets from different sources even when those sources use different versions of certain municipality names.
Source: Maanmittauslaitos, MML.
municipality.info.mml <- get_municipality_info_mml()
library(knitr)
kable(municipality.info.mml[1:2,])
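To illustrate why harmonized row names are useful, the following base R sketch joins two small tables by row name. The data frames `population` and `area` and all their values are made up for this illustration; they are not sorvi data.

```r
# Hypothetical example tables; the values are made up for illustration only
population <- data.frame(population = c(100, 200, 300),
                         row.names = c("Helsinki", "Tampere", "Turku"))
area <- data.frame(area = c(30, 10, 20),
                   row.names = c("Turku", "Helsinki", "Tampere"))

# Row names act as the shared key, so the row order need not agree
common <- intersect(rownames(population), rownames(area))
combined <- cbind(population[common, , drop = FALSE],
                  area[common, , drop = FALSE])
print(combined)
```

Indexing both tables with the same vector of row names aligns the rows before the columns are bound together, which is the step that fails when the sources spell a municipality name differently.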
Map all municipalities to corresponding provinces:
m2p <- municipality_to_province()
kable(head(m2p)) # Just show the first ones
Map selected municipalities to corresponding provinces:
municipality_to_province(c("Helsinki", "Tampere", "Turku"))
Speed up conversion with predefined info table:
m2p <- municipality_to_province(c("Helsinki", "Tampere", "Turku"), municipality.info.mml)
kable(head(m2p))
Municipality name to code
convert_municipality_codes(municipalities = c("Turku", "Tampere"))
Municipality codes to names
convert_municipality_codes(ids = c(853, 837))
Complete conversion table
municipality_ids <- convert_municipality_codes()
kable(head(municipality_ids)) # just show the first entries
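The conversions above amount to lookups in a code-to-name table. The sketch below shows the idea in base R with a three-row toy table. The names `municipality_table`, `name_to_id` and `id_to_name` are hypothetical and are not part of sorvi, and storing the codes as zero-padded strings is an assumption of this sketch.

```r
# Toy lookup table (assumption: a tiny stand-in for the full conversion table)
municipality_table <- data.frame(
  id = c("091", "837", "853"),
  name = c("Helsinki", "Tampere", "Turku"),
  stringsAsFactors = FALSE
)

# Name -> code: match() gives the row index of each requested name
name_to_id <- function(names) {
  municipality_table$id[match(names, municipality_table$name)]
}

# Code -> name: compare codes as zero-padded three-character strings
id_to_name <- function(ids) {
  ids <- sprintf("%03d", as.integer(ids))
  municipality_table$name[match(ids, municipality_table$id)]
}

print(name_to_id(c("Turku", "Tampere")))
print(id_to_name(c(853, 837)))
```

Names or codes that are missing from the table come back as `NA`, which makes unmatched entries easy to spot.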
Generic conversion of synonyms into harmonized terms.
First, get a table that maps synonyms to harmonized names. In this example we harmonize Finnish municipality names that have multiple versions, but the synonym list can be arbitrary.
f <- system.file("extdata/municipality_synonymes.csv", package = "sorvi")
synonymes <- read.csv(f, sep = "\t")
Validate the synonym list and add lowercase versions of the terms:
synonymes <- check_synonymes(synonymes, include.lowercase = TRUE)
Convert the given terms from synonyms to the harmonized names:
harmonized <- harmonize_names(c("Mantta", "Koski.Tl"), synonymes)
kable(harmonized)
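The general idea of synonym harmonization can be sketched in base R as a table lookup. Everything below is an illustration and not the sorvi implementation: the table contents, the column names `synonyme` and `name`, and the helpers `add_lowercase` and `harmonize` are all assumptions made for this example.

```r
# Hypothetical synonym table: each row maps one spelling variant to a
# harmonized name (entries are illustrative only)
synonym_table <- data.frame(
  synonyme = c("Mantta", "Koski.Tl", "Koski Tl"),
  name = c("Mantta-Vilppula", "Koski Tl", "Koski Tl"),
  stringsAsFactors = FALSE
)

# Add lowercase variants so that matching tolerates differences in case
add_lowercase <- function(tab) {
  lower <- transform(tab, synonyme = tolower(synonyme))
  unique(rbind(tab, lower))
}

# Replace each term by its harmonized name; unknown terms become NA
harmonize <- function(terms, tab) {
  tab$name[match(terms, tab$synonyme)]
}

tab <- add_lowercase(synonym_table)
print(harmonize(c("mantta", "Koski.Tl", "Unknown"), tab))
```

Returning `NA` for unknown terms, rather than passing them through unchanged, makes gaps in the synonym list visible.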
Extracting information from a Finnish personal identification number
library(sorvi)
hetu("111111-111C")
## hetu gender personal.number checksum date day month year
## 1 111111-111C Male 111 C 1911-11-11 11 11 1911
## century.char
## 1 -
The function also accepts vectors as input and returns a data frame:
library(knitr)
kable(hetu(c("010101-0101", "111111-111C")))
hetu | gender | personal.number | checksum | date | day | month | year | century.char |
---|---|---|---|---|---|---|---|---|
010101-0101 | Female | 10 | 1 | 1901-01-01 | 1 | 1 | 1901 | - |
111111-111C | Male | 111 | C | 1911-11-11 | 11 | 11 | 1911 | - |
Extracting a specific field:
hetu(c("010101-0101", "111111-111C"), extract = "gender")
## [1] "Female" "Male"
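The fields shown above follow from the fixed layout of the identification number (DDMMYY, a century marker, a three-digit personal number and a control character). The base R sketch below splits a number into these parts. `parse_hetu` is a hypothetical helper, not the sorvi function, and it handles only the three traditional century markers `+`, `-` and `A`.

```r
# Sketch: split HETU strings of the form DDMMYYCNNNQ into their parts
parse_hetu <- function(hetu) {
  day <- as.integer(substr(hetu, 1, 2))
  month <- as.integer(substr(hetu, 3, 4))
  yy <- as.integer(substr(hetu, 5, 6))
  century_char <- substr(hetu, 7, 7)
  personal_number <- as.integer(substr(hetu, 8, 10))

  # Century markers: "+" = 1800s, "-" = 1900s, "A" = 2000s
  century <- unname(c("+" = 1800, "-" = 1900, "A" = 2000)[century_char])
  year <- as.integer(century + yy)

  data.frame(
    hetu = hetu,
    # An odd personal number denotes male, an even one female
    gender = ifelse(personal_number %% 2 == 1, "Male", "Female"),
    personal.number = personal_number,
    checksum = substr(hetu, 11, 11),
    date = as.Date(sprintf("%04d-%02d-%02d", year, month, day),
                   format = "%Y-%m-%d"),
    year = year,
    stringsAsFactors = FALSE
  )
}

print(parse_hetu(c("010101-0101", "111111-111C")))
```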
Validate Finnish personal identification number:
valid_hetu("010101-0101") # TRUE/FALSE
## [1] TRUE
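The core of the validation is the control character: the nine digits DDMMYYNNN are read as one number, and the remainder of dividing it by 31 selects a character from a fixed 31-character alphabet. The base R sketch below checks only the format and the control character. It does not check that the date exists, and `hetu_checksum` and `check_hetu` are hypothetical names, not sorvi functions.

```r
# Compute the expected control character for well-formed HETU strings
hetu_checksum <- function(hetu) {
  # Birth date digits (DDMMYY) followed by the personal number (NNN)
  digits <- paste0(substr(hetu, 1, 6), substr(hetu, 8, 10))
  control_chars <- strsplit("0123456789ABCDEFHJKLMNPRSTUVWXY", "")[[1]]
  control_chars[as.numeric(digits) %% 31 + 1]
}

# TRUE when the format matches and the control character agrees
check_hetu <- function(hetu) {
  well_formed <- grepl("^[0-9]{6}[-+A][0-9]{3}[0-9A-Y]$", hetu)
  valid <- rep(FALSE, length(hetu))
  valid[well_formed] <- hetu_checksum(hetu[well_formed]) ==
    substr(hetu[well_formed], 11, 11)
  valid
}

print(check_hetu(c("010101-0101", "111111-111C", "111111-111D")))
```

For example, 111111111 leaves remainder 12 when divided by 31, and the character at zero-based position 12 of the alphabet is `C`, so `111111-111C` passes while `111111-111D` does not.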
Draw a regression curve with smoothed error bars, based on the Visually-Weighted Regression by Solomon M. Hsiang. The sorvi implementation extends Felix Schönbrodt’s original code.
library(sorvi)
data(iris)
p <- regression_plot(Sepal.Length ~ Sepal.Width, iris)
print(p)
This work can be freely used, modified and distributed under the Two-clause BSD license.
citation("sorvi")
##
## Kindly cite the sorvi R package as follows:
##
## (C) Leo Lahti, Juuso Parkkinen, Joona Lehtomaki, Juuso Haapanen,
## Einari Happonen and Jussi Paananen (rOpenGov 2010-2015). sorvi:
## Finnish open data toolkit for R. URL:
## http://ropengov.github.com/sorvi
##
## A BibTeX entry for LaTeX users is
##
## @Misc{,
## title = {sorvi: Finnish open government data toolkit for R},
## author = {Leo Lahti and Juuso Parkkinen and Joona Lehtomaki and Juuso Haapanen and Einari Happonen and Jussi Paananen},
## doi = {10.5281/zenodo.10280},
## year = {2011},
## }
##
## Many thanks for all contributors! See: http://ropengov.github.com
This vignette was created with:
sessionInfo()
## R version 3.2.1 (2015-06-18)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Ubuntu 15.04
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.10.5 sorvi_0.7.26
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.11.6 magrittr_1.5 MASS_7.3-41
## [4] munsell_0.4.2 colorspace_1.2-6 R6_2.0.1
## [7] highr_0.5 stringr_1.0.0 plyr_1.8.3
## [10] dplyr_0.4.2 tools_3.2.1 parallel_3.2.1
## [13] grid_3.2.1 gtable_0.1.2 DBI_0.3.1
## [16] htmltools_0.2.6 lazyeval_0.1.10 yaml_2.1.13
## [19] assertthat_0.1 digest_0.6.8 RColorBrewer_1.1-2
## [22] reshape2_1.4.1 ggplot2_1.0.1 formatR_1.2
## [25] evaluate_0.7 rmarkdown_0.7 labeling_0.3
## [28] stringi_0.5-2 scales_0.2.5 proto_0.3-10