matrixStats.benchmarks


sum2() benchmarks

This report benchmark the performance of sum2() against alternative methods.

Alternative methods

as below

> sum2_R <- function(x, na.rm = FALSE, idxs) {
+     sum(x[idxs], na.rm = na.rm)
+ }

Data type “integer”

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), na_prob = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     }     else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (na_prob > 0) 
+         x[sample(n, size = na_prob * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n = %d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = mode)

Results

n = 1000 vector

All elements

> x <- data[["n = 1000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352391 285.9    7916910 422.9  7916910 422.9
Vcells 37519670 286.3   60997542 465.4 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 0.001168 0.001216 0.0013982 0.0012895 0.0013475 0.004206
1 sum2 0.004819 0.004942 0.0059565 0.0050265 0.0052650 0.029968
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 4.125856 4.064145 4.260034 3.898023 3.907236 7.125059

Figure: Benchmarking of sum2() and sum() on n = 1000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350229 285.8    7916910 422.9  7916910 422.9
Vcells 15884915 121.2   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.002353 0.0026730 0.0035542 0.0029875 0.0039720 0.025916
1 sum2 0.003939 0.0042425 0.0052645 0.0044535 0.0054965 0.025251
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000
1 sum2 1.674033 1.587168 1.481204 1.490711 1.383812 0.9743402

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350295 285.8    7916910 422.9  7916910 422.9
Vcells 15885067 121.2   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.003277 0.003511 0.0039569 0.0036315 0.0038375 0.025466
1 sum2 0.004710 0.004900 0.0053899 0.0050465 0.0053270 0.024170
  expr min lq mean median uq max
2 sum+[() 1.00000 1.000000 1.000000 1.000000 1.000000 1.0000000
1 sum2 1.43729 1.395614 1.362148 1.389646 1.388143 0.9491086

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350358 285.8    7916910 422.9  7916910 422.9
Vcells 15885821 121.2   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.004771 0.0050155 0.0053266 0.0051910 0.005371 0.017107
1 sum2 0.006047 0.0062185 0.0066067 0.0064505 0.006628 0.020971
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.00000 1.000000 1.000000 1.000000
1 sum2 1.267449 1.239856 1.24032 1.242632 1.234035 1.225872

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000 vector

All elements

> x <- data[["n = 10000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350356 285.8    7916910 422.9  7916910 422.9
Vcells 15885445 121.2   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 0.008725 0.0087605 0.0089259 0.0088635 0.008942 0.014956
1 sum2 0.019954 0.0201270 0.0204627 0.0202520 0.020422 0.036140
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 2.286991 2.297472 2.292511 2.284876 2.283829 2.416422

Figure: Benchmarking of sum2() and sum() on n = 10000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350484 285.8    7916910 422.9  7916910 422.9
Vcells 15886503 121.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.009582 0.0100370 0.0107898 0.0102045 0.0107000 0.028421
1 sum2 0.010930 0.0111795 0.0115658 0.0113155 0.0114935 0.027050
  expr min lq mean median uq max
2 sum+[() 1.00000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.14068 1.113829 1.071919 1.108874 1.074159 0.951761

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350547 285.8    7916910 422.9  7916910 422.9
Vcells 15887775 121.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.017435 0.018203 0.0194375 0.0184870 0.0191675 0.051430
1 sum2 0.018530 0.018993 0.0199242 0.0193585 0.0197810 0.046702
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.00000 1.000000 1.000000 1.0000000
1 sum2 1.062805 1.043399 1.02504 1.047141 1.032007 0.9080692

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350610 285.8    7916910 422.9  7916910 422.9
Vcells 15890076 121.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.032296 0.033896 0.0344275 0.0341810 0.0344325 0.046596
1 sum2 0.032661 0.034108 0.0347426 0.0343965 0.0346050 0.049085
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.000000 1.000000 1.00000 1.000000
1 sum2 1.011302 1.006254 1.009151 1.006305 1.00501 1.053417

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 100000 vector

All elements

> x <- data[["n = 100000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350608 285.8    7916910 422.9  7916910 422.9
Vcells 15890007 121.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 100000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 0.068167 0.0718690 0.0790733 0.0786455 0.085223 0.130131
1 sum2 0.137269 0.1470985 0.1563849 0.1554880 0.171551 0.199221
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 2.013716 2.046759 1.977721 1.977074 2.012966 1.530927

Figure: Benchmarking of sum2() and sum() on n = 100000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350736 285.8    7916910 422.9  7916910 422.9
Vcells 15896466 121.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.066175 0.0712885 0.0755466 0.074185 0.0799685 0.124713
2 sum+[() 0.068085 0.0735490 0.0782850 0.078106 0.0826760 0.098722
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000
2 sum+[() 1.028863 1.031709 1.036247 1.052854 1.033857 0.7915935

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350799 285.8    7916910 422.9  7916910 422.9
Vcells 15906878 121.4   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.120044 0.1245475 0.1370164 0.1291520 0.1539495 0.173473
2 sum+[() 0.126368 0.1331820 0.1463841 0.1445065 0.1594265 0.197184
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.052681 1.069327 1.068369 1.118887 1.035577 1.136684

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350862 285.8    7916910 422.9  7916910 422.9
Vcells 15926920 121.6   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.203559 0.2181945 0.2489740 0.2394055 0.2719680 0.447498
2 sum+[() 0.213236 0.2318810 0.2650346 0.2562470 0.2939635 0.429438
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000
2 sum+[() 1.047539 1.062726 1.064507 1.070347 1.080875 0.9596423

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 1000000 vector

All elements

> x <- data[["n = 1000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350860 285.8    7916910 422.9  7916910 422.9
Vcells 15926988 121.6   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 0.520624 0.5218835 0.5805353 0.5332435 0.635164 0.894909
1 sum2 1.030878 1.0325495 1.1229403 1.0373390 1.153496 1.780989
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.00000 1.000000
1 sum2 1.980082 1.978506 1.934319 1.945338 1.81606 1.990134

Figure: Benchmarking of sum2() and sum() on n = 1000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5350988 285.8    7916910 422.9  7916910 422.9
Vcells 15987446 122.0   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.469893 0.4836925 0.5713289 0.5306825 0.6128710 0.991029
2 sum+[() 0.516228 0.5354775 0.6459817 0.5947250 0.7353905 0.966928
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.00000 1.000000 1.0000000
2 sum+[() 1.098608 1.107062 1.130665 1.12068 1.199911 0.9756808

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351051 285.8    7916910 422.9  7916910 422.9
Vcells 16087490 122.8   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.944178 0.9681965 1.064035 1.001398 1.084610 1.674459
2 sum+[() 1.024512 1.0728995 1.338005 1.327036 1.568442 2.334691
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.085084 1.108142 1.257483 1.325183 1.446087 1.394296

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351114 285.8    7916910 422.9  7916910 422.9
Vcells 16288062 124.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 1.892380 1.932612 2.074628 1.960306 2.137136 3.298452
2 sum+[() 2.062795 2.120582 2.656786 2.400110 3.044909 12.688548
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.090053 1.097262 1.280609 1.224355 1.424761 3.846819

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000000 vector

All elements

> x <- data[["n = 10000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351112 285.8    7916910 422.9  7916910 422.9
Vcells 16287686 124.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 6.009179 6.159402 6.451626 6.263462 6.50239 10.22468
1 sum2 10.978285 11.118918 11.571511 11.308272 11.66396 15.89741
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.826919 1.805195 1.793581 1.805435 1.793796 1.554808

Figure: Benchmarking of sum2() and sum() on n = 10000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351240 285.8    7916910 422.9  7916910 422.9
Vcells 16888144 128.9   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 6.931383 7.039096 7.208666 7.09559 7.221058 8.88509
2 sum+[() 7.915536 10.491024 10.942090 10.62573 10.741666 32.37583
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.141985 1.490394 1.517908 1.497511 1.487547 3.643838

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351303 285.8    7916910 422.9  7916910 422.9
Vcells 17888820 136.5   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 11.47321 12.05348 12.52430 12.37598 12.80797 16.59144
2 sum+[() 14.05279 15.26082 18.46393 18.81510 19.21084 33.62563
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.224835 1.266092 1.474248 1.520292 1.499913 2.026685

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351366 285.8    7916910 422.9  7916910 422.9
Vcells 19888862 151.8   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 20.58196 21.19114 22.11400 21.82266 22.46726 26.19977
2 sum+[() 31.80743 33.68354 36.16748 35.12630 38.10734 45.52142
  expr min lq mean median uq max
1 sum2 1.000000 1.00000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.545404 1.58951 1.635502 1.609625 1.696128 1.737474

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

Data type “double”

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), na_prob = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     }     else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (na_prob > 0) 
+         x[sample(n, size = na_prob * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n = %d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = mode)

Results

n = 1000 vector

All elements

> x <- data[["n = 1000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351364 285.8    7916910 422.9  7916910 422.9
Vcells 25443986 194.2   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 0.001648 0.0021740 0.0023646 0.002207 0.002557 0.006714
1 sum2 0.005059 0.0059845 0.0071861 0.006402 0.007165 0.033659
  expr min lq mean median uq max
2 sum 1.000000 1.00000 1.000000 1.00000 1.000000 1.000000
1 sum2 3.069782 2.75276 3.039051 2.90077 2.802112 5.013256

Figure: Benchmarking of sum2() and sum() on n = 1000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351489 285.9    7916910 422.9  7916910 422.9
Vcells 21445299 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.00247 0.0027035 0.0030827 0.002908 0.0030575 0.016484
1 sum2 0.00400 0.0042025 0.0045563 0.004350 0.0045815 0.019722
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.619433 1.554466 1.478048 1.495874 1.498446 1.196433

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351555 285.9    7916910 422.9  7916910 422.9
Vcells 21445448 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.003078 0.0033810 0.0037168 0.0035325 0.0037015 0.019633
1 sum2 0.004574 0.0047675 0.0051189 0.0048660 0.0051105 0.021225
  expr min lq mean median uq max
2 sum+[() 1.00000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.48603 1.410086 1.377209 1.377495 1.380656 1.081088

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351618 285.9    7916910 422.9  7916910 422.9
Vcells 21445690 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.004654 0.0049445 0.0054997 0.0051455 0.0053420 0.022807
1 sum2 0.006135 0.0063190 0.0068996 0.0065230 0.0067945 0.022205
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.000000 1.00000 1.000000 1.0000000
1 sum2 1.318221 1.277986 1.254544 1.26771 1.271902 0.9736046

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000 vector

All elements

> x <- data[["n = 10000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351616 285.9    7916910 422.9  7916910 422.9
Vcells 21445314 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 0.012993 0.0131210 0.0135548 0.0131995 0.0133225 0.026639
1 sum2 0.020162 0.0205545 0.0211470 0.0207250 0.0209640 0.050766
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.551759 1.566535 1.560108 1.570135 1.573578 1.905702

Figure: Benchmarking of sum2() and sum() on n = 10000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351741 285.9    7916910 422.9  7916910 422.9
Vcells 21446367 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.010247 0.010542 0.0113332 0.010767 0.011137 0.030083
1 sum2 0.011082 0.011336 0.0118216 0.011512 0.011732 0.027191
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.081487 1.075318 1.043093 1.069193 1.053425 0.903866

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351807 285.9    7916910 422.9  7916910 422.9
Vcells 21448339 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.016896 0.0178635 0.0191040 0.0183035 0.0186710 0.041301
1 sum2 0.017940 0.0189610 0.0194224 0.0191360 0.0194785 0.036391
  expr min lq mean median uq max
2 sum+[() 1.00000 1.000000 1.000000 1.000000 1.000000 1.0000000
1 sum2 1.06179 1.061438 1.016666 1.045483 1.043249 0.8811167

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351870 285.9    7916910 422.9  7916910 422.9
Vcells 21450381 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum+[() 0.030358 0.0317435 0.0362270 0.033332 0.034177 0.129086
1 sum2 0.031400 0.0329395 0.0370038 0.034093 0.035166 0.088019
  expr min lq mean median uq max
2 sum+[() 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000
1 sum2 1.034324 1.037677 1.021443 1.022831 1.028938 0.6818633

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 100000 vector

All elements

> x <- data[["n = 100000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351868 285.9    7916910 422.9  7916910 422.9
Vcells 21450005 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 100000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 0.101184 0.108809 0.1201335 0.1248855 0.1269145 0.153317
1 sum2 0.137283 0.147335 0.1597881 0.1585900 0.1717430 0.212284
  expr min lq mean median uq max
2 sum 1.000000 1.00000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.356766 1.35407 1.330088 1.269883 1.353218 1.384608

Figure: Benchmarking of sum2() and sum() on n = 100000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5351993 285.9    7916910 422.9  7916910 422.9
Vcells 21456459 163.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.071794 0.0780115 0.0814247 0.0811285 0.0815925 0.143911
2 sum+[() 0.080814 0.0856615 0.0901845 0.0904820 0.0918645 0.126215
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000
2 sum+[() 1.125637 1.098062 1.107582 1.115292 1.125894 0.8770351

Figure: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352059 285.9    7916910 422.9  7916910 422.9
Vcells 21466508 163.8   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.120250 0.1247415 0.1349471 0.1294015 0.1442605 0.179994
2 sum+[() 0.132394 0.1378225 0.1531807 0.1513765 0.1619370 0.260184
  expr min lq mean median uq max
1 sum2 1.00000 1.000000 1.000000 1.00000 1.000000 1.000000
2 sum+[() 1.10099 1.104865 1.135117 1.16982 1.122532 1.445515

Figure: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352122 285.9    7916910 422.9  7916910 422.9
Vcells 21487648 164.0   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.209682 0.2303455 0.2523607 0.2447505 0.2637385 0.356900
2 sum+[() 0.228803 0.2502210 0.2763647 0.2705405 0.3052540 0.336282
  expr min lq mean median uq max
1 sum2 1.00000 1.000000 1.000000 1.000000 1.000000 1.0000000
2 sum+[() 1.09119 1.086286 1.095118 1.105373 1.157412 0.9422303

Figure: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 1000000 vector

All elements

> x <- data[["n = 1000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352120 285.9    7916910 422.9  7916910 422.9
Vcells 21487272 164.0   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 1.090260 1.131690 1.181509 1.158953 1.215558 1.369708
1 sum2 1.376677 1.434922 1.489486 1.452148 1.506457 1.923593
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 sum2 1.262705 1.267946 1.260664 1.252982 1.239313 1.404382

Figure: Benchmarking of sum2() and sum() on n = 1000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352245 285.9    7916910 422.9  7916910 422.9
Vcells 21547725 164.4   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 0.814006 0.916464 1.007555 0.948396 1.015804 1.891452
2 sum+[() 1.023710 1.203260 1.506603 1.574297 1.667591 2.364770
  expr min lq mean median uq max
1 sum2 1.00000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.25762 1.312938 1.495306 1.659957 1.641646 1.250241

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352311 285.9    7916910 422.9  7916910 422.9
Vcells 21647774 165.2   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 1.224094 1.352000 1.403731 1.377432 1.420805 1.810562
2 sum+[() 1.549871 1.596701 1.961652 1.666648 2.086470 10.645034
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.266137 1.180992 1.397456 1.209968 1.468512 5.879409

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352374 285.9    7916910 422.9  7916910 422.9
Vcells 21847816 166.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 2.118721 2.237180 2.406776 2.318640 2.551486 3.524042
2 sum+[() 2.552412 2.872447 3.884592 4.032706 4.226303 14.348704
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.204695 1.283959 1.614023 1.739255 1.656409 4.071661

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000000 vector

All elements

> x <- data[["n = 10000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352372 285.9    7916910 422.9  7916910 422.9
Vcells 21847440 166.7   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
2 sum 9.602826 10.47414 11.06585 10.90738 11.30738 15.02775
1 sum2 12.078122 12.80481 13.35954 13.19347 13.74669 16.41842
  expr min lq mean median uq max
2 sum 1.000000 1.000000 1.000000 1.000000 1.000000 1.00000
1 sum2 1.257767 1.222516 1.207276 1.209591 1.215727 1.09254

Figure: Benchmarking of sum2() and sum() on n = 10000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352497 285.9    7916910 422.9  7916910 422.9
Vcells 22447893 171.3   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 8.969557 9.379087 9.792749 9.581693 9.960026 11.55101
2 sum+[() 13.421488 17.261918 18.552058 17.725341 18.571745 31.79983
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.496338 1.840469 1.894469 1.849918 1.864628 2.752992

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352563 285.9    7916910 422.9  7916910 422.9
Vcells 23449287 179.0   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 13.63601 13.99001 14.51047 14.35626 14.83623 17.38676
2 sum+[() 20.06319 27.97003 29.21771 28.61166 29.74101 35.96571
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.00000 1.000000 1.00000 1.000000
2 sum+[() 1.471339 1.999287 2.01356 1.992975 2.00462 2.068569

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

A 80% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  5352626 285.9    7916910 422.9  7916910 422.9
Vcells 25449329 194.2   48798034 372.3 94934136 724.3
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

  expr min lq mean median uq max
1 sum2 23.31134 23.79229 24.42539 24.23359 24.95218 27.04734
2 sum+[() 44.67206 48.05749 55.42160 52.91522 62.69120 73.59090
  expr min lq mean median uq max
1 sum2 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
2 sum+[() 1.916323 2.019877 2.269016 2.183548 2.512453 2.720819

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

Appendix

Session information

R version 4.1.1 Patched (2021-08-10 r80727)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /home/hb/software/R-devel/R-4-1-branch/lib/R/lib/libRblas.so
LAPACK: /home/hb/software/R-devel/R-4-1-branch/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] microbenchmark_1.4-7   matrixStats_0.60.0     ggplot2_3.3.5         
[4] knitr_1.33             R.devices_2.17.0       R.utils_2.10.1        
[7] R.oo_1.24.0            R.methodsS3_1.8.1-9001 history_0.0.1-9000    

loaded via a namespace (and not attached):
 [1] Biobase_2.52.0          httr_1.4.2              splines_4.1.1          
 [4] bit64_4.0.5             network_1.17.1          assertthat_0.2.1       
 [7] highr_0.9               stats4_4.1.1            blob_1.2.2             
[10] GenomeInfoDbData_1.2.6  robustbase_0.93-8       pillar_1.6.2           
[13] RSQLite_2.2.8           lattice_0.20-44         glue_1.4.2             
[16] digest_0.6.27           XVector_0.32.0          colorspace_2.0-2       
[19] Matrix_1.3-4            XML_3.99-0.7            pkgconfig_2.0.3        
[22] zlibbioc_1.38.0         genefilter_1.74.0       purrr_0.3.4            
[25] ergm_4.1.2              xtable_1.8-4            scales_1.1.1           
[28] tibble_3.1.4            annotate_1.70.0         KEGGREST_1.32.0        
[31] farver_2.1.0            generics_0.1.0          IRanges_2.26.0         
[34] ellipsis_0.3.2          cachem_1.0.6            withr_2.4.2            
[37] BiocGenerics_0.38.0     mime_0.11               survival_3.2-13        
[40] magrittr_2.0.1          crayon_1.4.1            statnet.common_4.5.0   
[43] memoise_2.0.0           laeken_0.5.1            fansi_0.5.0            
[46] R.cache_0.15.0          MASS_7.3-54             R.rsp_0.44.0           
[49] progressr_0.8.0         tools_4.1.1             lifecycle_1.0.0        
[52] S4Vectors_0.30.0        trust_0.1-8             munsell_0.5.0          
[55] tabby_0.0.1-9001        AnnotationDbi_1.54.1    Biostrings_2.60.2      
[58] compiler_4.1.1          GenomeInfoDb_1.28.1     rlang_0.4.11           
[61] grid_4.1.1              RCurl_1.98-1.4          cwhmisc_6.6            
[64] rstudioapi_0.13         rappdirs_0.3.3          startup_0.15.0-9000    
[67] labeling_0.4.2          bitops_1.0-7            base64enc_0.1-3        
[70] boot_1.3-28             gtable_0.3.0            DBI_1.1.1              
[73] markdown_1.1            R6_2.5.1                lpSolveAPI_5.5.2.0-17.7
[76] rle_0.9.2               dplyr_1.0.7             fastmap_1.1.0          
[79] bit_4.0.4               utf8_1.2.2              parallel_4.1.1         
[82] Rcpp_1.0.7              vctrs_0.3.8             png_0.1-7              
[85] DEoptimR_1.0-9          tidyselect_1.1.1        xfun_0.25              
[88] coda_0.19-4            

Total processing time was 1.39 mins.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('sum2')

Copyright Henrik Bengtsson. Last updated on 2021-08-25 22:49:31 (+0200 UTC). Powered by RSP.