The funprog package provides high-order functions for data manipulation (grouping, sorting, …). It takes its inspiration from other pure functional programming languages (hence the name).
You can install funprog from the CRAN with :
To get the development version, you can install it from gitlab :
Currently, the main functions are all inspired by Haskell functions.
group_if
group_if
splits a vector or a list into groups, given a predicate function.
The predicate is a binary function returning a boolean, applied to every couple of adjacent elements. If it evaluates to TRUE
, those elements belong to the same group, otherwise they belong to different groups.
x1 <- c(2, 4, 2, 2, 1, 1, 1, 3)
str(group_eq(x1)) # shortcut for group_if(x1, `==`)
#> List of 5
#> $ : num 2
#> $ : num 4
#> $ : num [1:2] 2 2
#> $ : num [1:3] 1 1 1
#> $ : num 3
str(group_if(x1, `<=`)) # returns non decreasing sequences
#> List of 3
#> $ : num [1:2] 2 4
#> $ : num [1:2] 2 2
#> $ : num [1:4] 1 1 1 3
str(group_if(x1, function(x, y) abs(x - y) > 1))
#> List of 5
#> $ : num [1:3] 2 4 2
#> $ : num 2
#> $ : num 1
#> $ : num 1
#> $ : num [1:2] 1 3
group_if
is inspired by Haskell’s groupBy function.
%on%
%on%
is a convenient operator that combines a binary function and a unary function to create a binary function.
Formally, (f %on% g)(x, y)
is defined as f(g(x), g(y))
. For instance :
It may be helpful to create a easy-to-read predicates, particularly in conjunction with group_if
:
x2 <- c(2, 4, 2, -2, -1, 1, 1, 3)
str(group_if(x2, `==` %on% abs))
#> List of 5
#> $ : num 2
#> $ : num 4
#> $ : num [1:2] 2 -2
#> $ : num [1:3] -1 1 1
#> $ : num 3
x3 <- list(1:3, 1:3, 3:5, 1, 2)
str(group_if(x3, `==` %on% length))
#> List of 2
#> $ :List of 3
#> ..$ : int [1:3] 1 2 3
#> ..$ : int [1:3] 1 2 3
#> ..$ : int [1:3] 3 4 5
#> $ :List of 2
#> ..$ : num 1
#> ..$ : num 2
%on%
is inspired by Haskell’s on function.
sort_by
sort_by
sorts a vector or a list, not on values itselves, but on a transformation of those values.
Additional functions can be used for breaking ties, as well as decreasing order.
str(sort_by(list(1:2, 3:4, 5), length, descending(sum)))
#> List of 3
#> $ : num 5
#> $ : int [1:2] 3 4
#> $ : int [1:2] 1 2
sort_by
is inspired by Haskell’s sortBy function.
sort_by
and group_if
can work together well to perform grouping operations. The next example row-binds data.frames sharing the same column names :
library(dplyr)
dfs <- list(
data.frame(A = 0:1, B = c("a", "b")),
data.frame(C = 3, D = 4, E = 5),
data.frame(A = 1),
data.frame(A = 3, B = "c"),
data.frame(C = 5:6, D = 7:8, E = 10:11)
)
dfs %>%
# sort_by to make data.frames sharing same names adjacent
sort_by(function(x) paste(names(x), collapse = "#")) %>%
# group_if on identical column names
group_if(identical %on% names) %>%
# concatenate data.frames belonging to the same group
lapply(bind_rows)
#> [[1]]
#> A
#> 1 1
#>
#> [[2]]
#> A B
#> 1 0 a
#> 2 1 b
#> 3 3 c
#>
#> [[3]]
#> C D E
#> 1 3 4 5
#> 2 5 7 10
#> 3 6 8 11
If you have installed the purrr package, you can use the shortcut syntax for specifying arguments that are functions :
x1 <- c(2, 4, 2, 2, 1, 1, 1, 3)
str(group_if(x1, ~ abs(.x - .y) > 1))
#> List of 5
#> $ : num [1:3] 2 4 2
#> $ : num 2
#> $ : num 1
#> $ : num 1
#> $ : num [1:2] 1 3
x4 <- list(c(a = 1, b = 2, c = 3), c(a = 10, b = 20))
sort_by(x4, "b") # shortcut for `function(x) x[["b"]]`
#> [[1]]
#> a b c
#> 1 2 3
#>
#> [[2]]
#> a b
#> 10 20
sort_by(x4, descending(1)) # shortcut for `descending(function(x) x[[1]])`
#> [[1]]
#> a b
#> 10 20
#>
#> [[2]]
#> a b c
#> 1 2 3