Part 1 An Overview of The Futureverse

1.1 Why do we parallelize?

Parallel & distributed processing can be used to:

  • speed up processing (wall time)
  • lower memory footprint (per machine)
  • avoid data transfers (compute where data lives)
  • other reasons, e.g. asynchronous UI/UX

1.2 The future package (the core of it all)

The hex logo for the ‘future’ package, adapted from an original design by Dan LaBar
  • A simple, unifying solution for parallel APIs
  • “Write once, run anywhere”
  • 100% cross-platform, e.g. Linux, macOS, and MS Windows
  • Easy to install (< 0.5 MiB total); install.packages("future")
  • Well tested, lots of CPU mileage, used in production
  • Things should “just work”
  • Design goal: keep as minimal as possible
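
The "write once, run anywhere" idea can be sketched as follows; `slow_sqrt()` is a hypothetical stand-in for an expensive computation. The code that creates and collects futures is identical under both plans; only the `plan()` call changes:

library(future)

## Hypothetical stand-in for an expensive computation
slow_sqrt <- function(x) {
  Sys.sleep(1)
  sqrt(x)
}

plan(sequential)            # evaluate futures in the current R session
f <- future(slow_sqrt(4))
value(f)                    # 2

plan(multisession)          # same code, now evaluated in a background R session
f <- future(slow_sqrt(4))
value(f)                    # 2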

1.3 Quick intro: Evaluate R in the background

1.3.1 Sequentially

x <- 7
y <- slow(x)           # ~1 minute
z <- another(x)        # ~0.5 minute
                       # all done in ~1.5 minutes

1.3.2 In parallel

library(future)
plan(multisession)     # run things in parallel

x <- 7
f <- future(slow(x))   # ~1 minute (in background)
z <- another(x)        # ~0.5 minute (in current R session)
y <- value(f)          # get background results
                       # all done in ~1 minute

1.4 Quick intro: Parallel base-R apply

1.4.1 Sequentially

x <- 1:20
y <- lapply(x, slow)          # ~20 minutes

1.4.2 In parallel

library(future.apply)
plan(multisession, workers = 4)

x <- 1:20
y <- future_lapply(x, slow)   # ~5 minutes

1.5 Quick intro: Parallel tidyverse apply

1.5.1 Sequentially

library(purrr)

x <- 1:20
y <- map(x, slow)          # ~20 minutes

1.5.2 In parallel

library(furrr)
plan(multisession, workers = 4)

x <- 1:20
y <- future_map(x, slow)   # ~5 minutes

1.6 Quick intro: Parallel foreach

1.6.1 Sequentially

library(foreach)

x <- 1:20
y <- foreach(z = x) %do% slow(z)     # ~20 minutes

Comment: Technically, we want to use y <- foreach(z = x) %do% local({ slow(z) }) here, so that each iteration is evaluated in a local environment, just as %dopar% does.

1.6.2 In parallel

library(doFuture)
registerDoFuture()
plan(multisession, workers = 4)

x <- 1:20
y <- foreach(z = x) %dopar% slow(z)  # ~5 minutes

1.7 What is the Futureverse?

  • A Unifying Parallelization Framework in R for Everyone

  • Requires only minimal changes to parallelize existing R code

  • “Write once, parallelize anywhere”

  • Same code regardless of operating system and parallel backend

  • Lower the bar to get started with parallelization

  • Fewer decisions for the developer to make

  • Stay with your favorite coding style

  • Worry-free: globals, packages, output, warnings, errors just work

  • Statistically sound: Built-in parallel random number generation (RNG)

  • Correctness and reproducibility are of the highest priority

  • “Future proof”: Support any new parallel backends to come
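
As a minimal sketch of the built-in parallel RNG, the map-reduce functions take a `future.seed` argument that sets up statistically sound, reproducible RNG streams for the workers:

library(future.apply)
plan(multisession, workers = 4)

## future.seed = <integer> gives parallel RNG streams that are
## statistically sound and reproducible
y1 <- future_lapply(1:4, function(z) rnorm(1), future.seed = 42)
y2 <- future_lapply(1:4, function(z) rnorm(1), future.seed = 42)
identical(y1, y2)   # TRUE - same results regardless of backend

Without a declared seed, drawing random numbers in parallel risks correlated or non-reproducible streams; `future.seed` avoids both.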

1.7.1 Packages part of the Futureverse

Core API:

  • future

Map-reduce API:

  • future.apply (parallel versions of base-R apply functions)
  • furrr (parallel versions of purrr's map functions)
  • doFuture (use foreach's %dopar% via futures)

Parallel backends:

  • sequential, multisession, multicore, and cluster (built into future)
  • future.callr (background R sessions via callr)
  • future.batchtools (HPC job schedulers, e.g. Slurm and SGE, via batchtools)

Additional packages:

  • progressr (progress updates, also in parallel)
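
As a sketch of how progressr composes with the map-reduce APIs, `slow()` is the same placeholder function used in the earlier examples:

library(future.apply)
library(progressr)
plan(multisession, workers = 4)

x <- 1:20
with_progress({
  p <- progressor(along = x)          # one progress step per element
  y <- future_lapply(x, function(z) {
    p()                               # signal progress, also from workers
    slow(z)
  })
})

Progress updates are signaled from the parallel workers back to the main R session, so the progress bar advances as tasks complete on any backend.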

The first CRAN release was on 2015-06-19, but the initial seed toward building the framework was planted back in 2005. It all grew out of collaborative, real-world research needs of large-scale scientific computations in Genomics and Bioinformatics on all operating systems.

1.7.2 Who is it for?

  • Everyone using R

  • Users with some experience in R, but no need to be an advanced R developer

  • Anyone who wishes to run many slow, repetitive tasks

  • Any developer who wants to support parallel processing without having to worry about the details or maintain parallel code

  • Anyone who wishes to set up an asynchronous Shiny app

1.7.3 Who is using it?

1.7.4 What about its quality and stability?

1.7.6 How to stay up-to-date