A Walk Through a Slice of Combinatorics in R*
Below, we examine packages equipped with the capabilities of generating combinations & permutations. If I have left out any package, please forgive me and please leave a comment or better yet, edit this post.
Outline of analysis:
- Introduction
- Combinations
- Permutations
- Multisets
- Summary
- Memory
Before we begin, we note that combinations/permutations with replacement of distinct vs. non-distint items chosen m at a time are equivalent. This is so, because when we have replacement, it is not specific. Thus, no matter how many times a particular element originally occurs, the output will have an instance(s) of that element repeated 1 to m times.
1. Introduction
gtools
v 3.8.1
combinat
v 0.0-8
multicool
v 0.1-10
partitions
v 1.9-19
RcppAlgos
v 2.0.1 (I am the author)
arrangements
v 1.1.0
gRbase
v 1.8-3
I did not include permute
, permutations
, or gRbase::aperm/ar_perm
as they are not really meant to attack these types of problems.
|--------------------------------------- OVERVIEW ----------------------------------------|
|_______________| gtools | combinat | multicool | partitions |
| comb rep | Yes | | | |
| comb NO rep | Yes | Yes | | |
| perm rep | Yes | | | |
| perm NO rep | Yes | Yes | Yes | Yes |
| perm multiset | | | Yes | |
| comb multiset | | | | |
|accepts factors| | Yes | | |
| m at a time | Yes | Yes/No | | |
|general vector | Yes | Yes | Yes | |
| iterable | | | Yes | |
|parallelizable | | | | |
| big integer | | | | |
|_______________| iterpc | arrangements | RcppAlgos | gRbase |
| comb rep | Yes | Yes | Yes | |
| comb NO rep | Yes | Yes | Yes | Yes |
| perm rep | Yes | Yes | Yes | |
| perm NO rep | Yes | Yes | Yes | * |
| perm multiset | Yes | Yes | Yes | |
| comb multiset | Yes | Yes | Yes | |
|accepts factors| | Yes | Yes | |
| m at a time | Yes | Yes | Yes | Yes |
|general vector | Yes | Yes | Yes | Yes |
| iterable | | Yes | Partially | |
|parallelizable | | Yes | Yes | |
| big integer | | Yes | | |
The tasks, m at a time
and general vector
, refer to the capability of generating results "m at a time" (when m is less than the length of the vector) and rearranging a "general vector" as opposed to 1:n
. In practice, we are generally concerned with finding rearrangements of a general vector, therefore all examinations below will reflect this (when possible).
All benchmarks were ran on 3 different set-ups.
- Macbook Pro i7 16Gb
- Macbook Air i5 4Gb
- Lenovo Running Windows 7 i5 8Gb
The listed results were obtained from setup #1 (i.e. MBPro). The results for the other two systems were similar. Also, gc()
is periodically called to ensure all memory is available (See ?gc
).
2. Combinations
First, we examine combinations without replacement chosen m at a time.
RcppAlgos
combinat
(or utils
)
gtools
arrangements
gRbase
How to:
library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(13)
testVector1 <- sort(sample(100, 17))
m <- 9
t1 <- comboGeneral(testVector1, m) ## returns matrix with m columns
t3 <- combinat::combn(testVector1, m) ## returns matrix with m rows
t4 <- gtools::combinations(17, m, testVector1) ## returns matrix with m columns
identical(t(t3), t4) ## must transpose to compare
#> [1] TRUE
t5 <- combinations(testVector1, m)
identical(t1, t5)
#> [1] TRUE
t6 <- gRbase::combnPrim(testVector1, m)
identical(t(t6)[do.call(order, as.data.frame(t(t6))),], t1)
#> [1] TRUE
Benchmark:
microbenchmark(cbRcppAlgos = comboGeneral(testVector1, m),
cbGRbase = gRbase::combnPrim(testVector1, m),
cbGtools = gtools::combinations(17, m, testVector1),
cbCombinat = combinat::combn(testVector1, m),
cbArrangements = combinations(17, m, testVector1),
unit = "relative")
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgos 1.064 1.079 1.160 1.012 1.086 2.318 100
#> cbGRbase 7.335 7.509 5.728 6.807 5.390 1.608 100
#> cbGtools 426.536 408.807 240.101 310.848 187.034 63.663 100
#> cbCombinat 97.756 97.586 60.406 75.415 46.391 41.089 100
#> cbArrangements 1.000 1.000 1.000 1.000 1.000 1.000 100
Now, we examine combinations with replacement chosen m at a time.
RcppAlgos
gtools
arrangements
How to:
library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(97)
testVector2 <- sort(rnorm(10))
m <- 8
t1 <- comboGeneral(testVector2, m, repetition = TRUE)
t3 <- gtools::combinations(10, m, testVector2, repeats.allowed = TRUE)
identical(t1, t3)
#> [1] TRUE
## arrangements
t4 <- combinations(testVector2, m, replace = TRUE)
identical(t1, t4)
#> [1] TRUE
Benchmark:
microbenchmark(cbRcppAlgos = comboGeneral(testVector2, m, TRUE),
cbGtools = gtools::combinations(10, m, testVector2, repeats.allowed = TRUE),
cbArrangements = combinations(testVector2, m, replace = TRUE),
unit = "relative")
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgos 1.000 1.000 1.000 1.000 1.000 1.00000 100
#> cbGtools 384.990 269.683 80.027 112.170 102.432 3.67517 100
#> cbArrangements 1.057 1.116 0.618 1.052 1.002 0.03638 100
3. Permutations
First, we examine permutations without replacement chosen m at a time.
RcppAlgos
gtools
arrangements
How to:
library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(101)
testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29))
## RcppAlgos... permuteGeneral same as comboGeneral above
t1 <- permuteGeneral(testVector3, 6)
## gtools... permutations same as combinations above
t3 <- gtools::permutations(10, 6, testVector3)
identical(t1, t3)
#> [1] TRUE
## arrangements
t4 <- permutations(testVector3, 6)
identical(t1, t4)
#> [1] TRUE
Benchmark:
microbenchmark(cbRcppAlgos = permuteGeneral(testVector3, 6),
cbGtools = gtools::permutations(10, 6, testVector3),
cbArrangements = permutations(testVector3, 6),
unit = "relative")
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgos 1.079 1.027 1.106 1.037 1.003 5.37 100
#> cbGtools 158.720 92.261 85.160 91.856 80.872 45.39 100
#> cbArrangements 1.000 1.000 1.000 1.000 1.000 1.00 100
Next, we examine permutations without replacement with a general vector (returning all permutations).
RcppAlgos
gtools
combinat
multicool
arrangements
How to:
library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(89)
testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29))
testVector3Prime <- testVector3[1:7]
## For RcppAlgos, & gtools (see above)
## combinat
t4 <- combinat::permn(testVector3Prime) ## returns a list of vectors
## convert to a matrix
t4 <- do.call(rbind, t4)
## multicool.. we must first call initMC
t5 <- multicool::allPerm(multicool::initMC(testVector3Prime)) ## returns a matrix with n columns
all.equal(t4[do.call(order,as.data.frame(t4)),],
t5[do.call(order,as.data.frame(t5)),])
#> [1] TRUE
Benchmark:
microbenchmark(cbRcppAlgos = permuteGeneral(testVector3Prime, 7),
cbGtools = gtools::permutations(7, 7, testVector3Prime),
cbCombinat = combinat::permn(testVector3Prime),
cbMulticool = multicool::allPerm(multicool::initMC(testVector3Prime)),
cbArrangements = permutations(x = testVector3Prime, k = 7),
unit = "relative")
#> Unit: relative
#> expr min lq mean median uq max neval
#> cbRcppAlgos 1.152 1.275 0.7508 1.348 1.342 0.3159 100
#> cbGtools 965.465 817.645 340.4159 818.137 661.068 12.7042 100
#> cbCombinat 280.207 236.853 104.4777 238.228 208.467 9.6550 100
#> cbMulticool 2573.001 2109.246 851.3575 2039.531 1638.500 28.3597 100
#> cbArrangements 1.000 1.000 1.0000 1.000 1.000 1.0000 100
Now, we examine permutations without replacement for 1:n
(returning all permutations).
RcppAlgos
gtools
combinat
multicool
partitions
arrangements
How to:
library(RcppAlgos)
library(arrangements)
library(microbenchmark)
options(digits = 4)
set.seed(89)
t1 <- partitions::perms(7) ## returns an object of type 'partition' with n rows
identical(t(as.matrix(t1)), permutations(7,7))
#> [1] TRUE
Benchmark:
microbenchmark(cbRcppAlgos = permuteGeneral(7, 7),
cbGtools = gtools::permutations(7, 7),
cbCombinat = combinat::permn(7),
cbMulticool = multicool::allPerm(multicool::initMC(1:7)),
cbPartitions = partitions::perms(7),
cbArrangements = permutations(7, 7),
unit = "relative")
#> Unit: relative
#> expr min lq mean median uq max
#> cbRcppAlgos 1.235 1.429 1.412 1.503 1.484 1.720
#> cbGtools 1152.826 1000.736 812.620 939.565 793.373 499.029
#> cbCombinat 347.446 304.866 260.294 296.521 248.343 284.001
#> cbMulticool 3001.517 2416.716 1903.903 2237.362 1811.006 131