Beyond Sequential: Scaling Existing Medical Pipelines with ‘futurize’

Medical research is powered by trusted R packages

`boot`	Bootstrap resampling for robust confidence intervals
`lme4`	Mixed-effects models for longitudinal patient data
`survival`	Time-to-event analysis for clinical endpoints
`DESeq2`	Differential expression in RNA-seq data
`scater`	Single-cell RNA-seq analysis, e.g. PCA, t-SNE, UMAP
…	…

Packages are distributed via the highly-trusted CRAN and Bioconductor repositories.

With supporting repositories such as R-universe, Pharmaverse, and R-multiverse.

Growing datasets make sequential analysis a bottleneck

Bootstrapping confidence intervals - 1,000 replicates × large cohort = hours
Mixed models across patient subgroups - dozens of subgroup fits, one after another
Simulation studies for sample-size planning - 10,000 iterations on a laptop overnight

← 32× CPU cores sitting idle

Parallelization can help →

Barrier is friction!

Parallelizing used to require abandoning existing code

Before: sequential, readable, auditable

res <- lapply(patients, fit_model)

Refactoring tax

Many parallelization APIs
Complicated to test
Expensive to maintain

After: parallel, hard to read, hard to maintain

library(parallel)

if (parallel) {
  if (.Platform$OS.type == "windows") {
    cl <- makeCluster(8)
    res <- parLapply(cl, patients, fit_model)
    stopCluster(cl)
  } else {
    options(mc.cores = 8)
    res <- mclapply(patients, fit_model)
  }
} else {
  res <- lapply(patients, fit_model)
}

Let’s look at how parallelization in R used to be done in the past. Consider a very simple piece of sequential code where we apply a function fit_model() to a set of patients, using the lapply() function in base R. That code is very readable and easy to understand; someone can validate it and see what’s going on. If you have a lot of patients, so that fitting the models takes a long time, we typically look into parallelization. Traditionally, we could use the ‘parallel’ package for that. But when we write code that we want to share with collaborators, some on Windows, some on Linux, some on macOS, we typically end up with code like that to the right. If we want to run in parallel, the first thing we check is whether we are running on Windows. If so, we set up a cluster of workers, in this case eight parallel workers, and then we call the parLapply() function. When done, we have to remember to stop the cluster of workers. Alternatively, if we’re not on Windows, we can use the mclapply() function, which uses forked parallelism; in that case we set it to run eight workers and parallelize. And if we don’t want to run in parallel, we can fall back to the regular lapply() function.

Futureverse was introduced to lower refactoring tax

Before: sequential, readable, auditable

res <- lapply(patients, fit_model)

After: parallel, readable, maintainable

library(future.apply)
plan(multisession)
res <- future_lapply(patients, fit_model)

So that was the type of code I used to write to maintain functions that anyone could run on Windows, macOS, and Linux, on small and big computers. But that became unmanageable and it was really hard to test. So I scrapped that in 2015 and started to work on what is today the Futureverse ecosystem. I started out by providing a unified interface for parallelization. The goal was to let researchers and other R developers separate what to parallelize from how to parallelize. So, instead of writing platform-specific code, you just define a future backend with the plan() function, say, ‘multisession’ for local parallelization, and then you pick the corresponding “future” function that mimics the one you used to use. For example, if you used lapply(), you can use the future_lapply() version that is available from the ‘future.apply’ package. You can see to the right what the code looks like after parallelization.

library(purrr)

res <- map(patients, fit_model)

library(furrr)
plan(multisession)
res <- future_map(patients, fit_model)

library(foreach)

res <- foreach(p = patients) %do% {
  fit_model(p) 
}

library(doFuture)
plan(multisession)
res <- foreach(p = patients) %dofuture% {
  fit_model(p) 
}
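
The separation of what to parallelize from how to parallelize can be made concrete with a small, self-contained sketch. Here `patients` and `fit_model()` are hypothetical stand-ins built from the built-in `mtcars` data, and two workers is an arbitrary choice. Only the plan() line changes between the sequential and the parallel run.

```r
library(future.apply)

# Hypothetical stand-ins for the talk's `patients` and `fit_model()`:
# one data frame per cylinder group, one linear model per group.
patients <- split(mtcars, mtcars$cyl)
fit_model <- function(d) coef(lm(mpg ~ wt, data = d))

# "What" to parallelize stays the same; "how" is set by plan().
plan(sequential)
res_seq <- future_lapply(patients, fit_model)

plan(multisession, workers = 2)   # assumption: two local workers
res_par <- future_lapply(patients, fit_model)

plan(sequential)                  # shut down the background workers

# Compare the coefficients from the two runs
print(all.equal(res_seq, res_par))
```

The call to future_lapply() is identical in both runs, which is what makes it possible to review the analysis logic independently of the execution backend.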

The R community has embraced the Futureverse

+30% reverse dependencies yearly

Top 0.7% most downloaded

… but, we can simplify it further with ‘futurize’

All you need to remember is futurize()

So, as I said before, one of my objectives with the Futureverse ecosystem is to minimize the friction of parallelization as much as possible. Ideally, a researcher should be able to parallelize their code without modifying a single function name or restructuring the code. The goal is to take code that has already been validated, or maybe even peer-reviewed, and enable parallel execution with a single transparent step. The functions and packages I’ve already shown do a great job of getting us towards that, but I would say we’re not there yet. They’re not perfect. We can simplify it even more, and that’s the purpose of the ‘futurize’ package. It acts as a single adapter that lets you keep your original code and simply pipe it to the futurize() function. That’s all. It removes most of the need for learning a new API and rewriting existing code.

New package futurize preserves your original code

Universal adapter: One unifying function futurize()
Zero rewrites: Original logic unchanged

res <- lapply(patients, fit_model) |>
         futurize()

library(purrr)
res <- map(patients, fit_model) |>
         futurize()

library(foreach)
res <- foreach(p = patients) %do% {
  fit_model(p)
} |> futurize()

library(plyr)
res <- llply(patients, fit_model) |>
         futurize()

library(BiocParallel)
res <- bplapply(patients, fit_model) |>
         futurize()

library(crossmap)
res <- xmap(x, ~ .y * .x) |>
         futurize()

Easy!
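
To try the lapply() variant above end to end, the following minimal sketch may help. It assumes the ‘futurize’ package is installed; `patients` and `fit_model()` are hypothetical stand-ins built from `mtcars`, and the ‘future’ package is attached explicitly for plan() as a precaution.

```r
library(futurize)
library(future)   # attached explicitly so that plan() is available

# Hypothetical stand-ins for the talk's `patients` and `fit_model()`
patients <- split(mtcars, mtcars$cyl)
fit_model <- function(d) coef(lm(mpg ~ wt, data = d))

# Original, validated code
res_seq <- lapply(patients, fit_model)

# Same call, piped to futurize() and run on two local workers
plan(multisession, workers = 2)   # assumption: two local workers
res_par <- lapply(patients, fit_model) |> futurize()
plan(sequential)

# Compare the sequential and the futurized results
print(all.equal(res_seq, res_par))
```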

Same code scales from laptop to cloud to HPC

Without changing any code, you can switch from local and remote parallel processing, to large-scale high-performance compute (HPC) processing:

`plan()`	Environment	Use case
`sequential`	Single machine	Sequential (default, debugging)
`multisession`	Single machine	Parallel across multiple cores
`mirai_multisession`	Single machine	Same as above; powered by mirai
`cluster`	Many machines	Parallel across many machines (desktops, cloud)
`batchtools_*`	Slurm/SGE/LSF	Scheduler-based HPC clusters

Like everything else in Futureverse, the futurize() function can take full advantage of all the parallel backends available in the ecosystem. You just specify the execution plan at the beginning of your script or your code.

The default is ‘sequential’ processing and that can be useful for local debugging.

The next step is typically to parallelize on your local computer and take advantage of all the cores; for that you can use the built-in ‘multisession’ backend. There is also ‘mirai_multisession’, which provides an alternative to ‘multisession’ with less latency and is powered by the ‘mirai’ package.

You can also scale out to other computers, local or remote, by using the ‘cluster’ backend. All you need is SSH access and R installed on them.

Some of us have access to high-performance compute clusters, where you submit jobs to a queue and the jobs are then distributed out to thousands of cores. That’s done by job schedulers like Slurm, SGE, LSF, Torque, PBS, and so on.

library(futurize)
plan(future.batchtools::batchtools_slurm)
res <- patients |> purrr::map(fit_model) |> futurize()
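
Because the backend is chosen in a single plan() call, one possible pattern is to select it at run time so the analysis script itself never changes. The sketch below is only an illustration: the environment variable `PIPELINE_BACKEND` and the host names are hypothetical, and the ‘cluster’ branch requires SSH access and R on each machine.

```r
library(future)

# Hypothetical environment variable; unset means sequential processing
backend <- Sys.getenv("PIPELINE_BACKEND", unset = "sequential")

switch(backend,
  sequential   = plan(sequential),
  multisession = plan(multisession, workers = 2),
  # Hypothetical host names
  cluster      = plan(cluster, workers = c("n1.example.org", "n2.example.org")),
  stop("Unknown backend: ", backend)
)

# Report which backend is active and how many workers it provides
cat("Backend:", backend, "with", nbrOfWorkers(), "worker(s)\n")
```

Any futurize() or future_lapply() call that follows picks up whichever plan was set here.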

… we can do even more with ‘futurize’

All you need to remember is futurize()

futurize() works with a growing set of domain-specific packages

CRAN Package	Use
boot	Bootstrap resampling, confidence intervals
caret	Classification and regression training
fwb	Bootstrap resampling, confidence intervals
gamlss	Generalized additive models (GAMLSS)
glmnet	Lasso and elastic-net regularization
glmmTMB	Generalized linear mixed models (GLMMs)
kernelshap	Kernel SHAP (Shapley Additive Explanations)
lme4	Linear and non-linear mixed-effects models
metafor	Meta-analysis models
mgcv	Generalized additive models (GAMs)
partykit	Recursive partitioning (trees)
riskRegression	Risk regression for survival analysis
seriation	Data ordering (seriation)
stars	Spatiotemporal data cubes
structchange	Testing for structural changes
tm	Text mining
vegan	Community ecology

Bioconductor Package	Use
BiocParallel	Map-reduce and parallel infrastructure
DESeq2	Differential gene expression analysis
GenomicAlignments	Genomic alignments (BAM/CRAM)
GSVA	Gene set variation analysis
Rsamtools	Binary alignment (BAM) and tabix utilities
scater	Single-cell transformations
scuttle	Single-cell analysis utilities
SingleCellExperiment	Single-cell data containers
sva	Surrogate variable analysis

Bootstrap simulations accelerated with futurize()

Sequential:

library(boot)
b <- boot(data = cohort, statistic = cox_stat, R = 100e3)

100,000 bootstrap replicates take hours on large cohorts!

A single worker

Parallel:

plan(future.mirai::mirai_multisession)
library(boot)
b <- boot(data = cohort, statistic = cox_stat, R = 100e3) |>
     futurize()

Faster when distributed across parallel workers.
Identical results.

32 parallel workers
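
A scaled-down version of the bootstrap example can be run on any laptop. This sketch assumes ‘futurize’ is installed and replaces `cohort` and `cox_stat()` with hypothetical stand-ins: the built-in `mtcars` data and a statistic `slope_stat()` that returns a regression slope. The replicate count and the number of workers are arbitrary.

```r
library(boot)
library(futurize)
library(future)   # attached explicitly so that plan() is available

# Hypothetical stand-ins for `cohort` and `cox_stat()`.
# boot() calls the statistic with the data and a vector of resampled row indices.
cohort <- mtcars
slope_stat <- function(data, idx) {
  fit <- lm(mpg ~ wt, data = data[idx, ])
  coef(fit)[["wt"]]
}

plan(multisession, workers = 2)   # assumption: two local workers
b <- boot(data = cohort, statistic = slope_stat, R = 200) |> futurize()
plan(sequential)

# Percentile confidence interval for the slope
print(boot.ci(b, type = "perc"))
```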

Traditional parallelization is more cumbersome and less robust

library(boot)

library(parallel)
cl <- makeCluster(32)

b <- boot(data = cohort, statistic = cox_stat, R = 100e3,
          parallel = "snow", ncpus = length(cl), cl = cl)

stopCluster(cl)

Parallelization arguments blur the bootstrapping logic
All three parallelization arguments must be specified
Does not interrupt nicely
Does not handle crashed parallel workers
Not easy to scale to cloud or HPC job schedulers
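
One of the points in the list above, workers left running after an error or interrupt, can be partly mitigated within the traditional approach by registering stopCluster() with on.exit(). The sketch below shows this; `boot_snow()` and `mean_stat()` are hypothetical helpers, and the other points in the list remain unaddressed.

```r
library(boot)
library(parallel)

# Hypothetical wrapper: creates the cluster, runs boot(), and always
# stops the cluster when the function exits, even on error.
boot_snow <- function(data, statistic, R, workers = 2L) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)
  boot(data = data, statistic = statistic, R = R,
       parallel = "snow", ncpus = workers, cl = cl)
}

# Hypothetical statistic: mean of the resampled values
mean_stat <- function(d, i) mean(d[i])

set.seed(42)
x <- rnorm(50)
b <- boot_snow(x, mean_stat, R = 100)
print(b)
```

The extra wrapper is precisely the kind of bookkeeping that sits next to the bootstrapping logic and has to be maintained alongside it.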

Yes, we can do progress reporting too

All you need to remember is progressify()

Progress reporting with ‘progressify’

Vanilla call:

res <- lapply(patients, fit_model)

With progress reporting:

library(progressify)
handlers("cli", globals = TRUE)

res <- lapply(patients, fit_model) |> 
         progressify()

■■■■■■■■■■■■■■■■80% | ETA: 12m

In parallel with progress reporting:

library(progressify)
handlers("cli", globals = TRUE)

library(futurize)
plan(multisession)

res <- lapply(patients, fit_model) |> 
         progressify() |> futurize()

■■■■■■■■■■■■■■■■80% | ETA: 23s

Easy!
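
For comparison, progress reporting can also be added by hand with the ‘progressr’ package, which requires editing the code that does the work. This is a sketch of that manual alternative, not a description of how progressify() is implemented; `patients`, `fit_model()`, and `fit_all()` are hypothetical.

```r
library(progressr)

# Hypothetical stand-ins for the talk's `patients` and `fit_model()`
patients <- 1:5
fit_model <- function(p) {
  Sys.sleep(0.01)   # pretend this is a slow model fit
  p^2
}

# Manual instrumentation: create a progressor and signal once per patient
fit_all <- function(patients) {
  p <- progressor(along = patients)
  lapply(patients, function(x) {
    res <- fit_model(x)
    p()             # signal one step of progress
    res
  })
}

# Progress is only displayed when wrapped in with_progress()
# (or when a global handler is enabled)
res <- with_progress(fit_all(patients))
print(unlist(res))
```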

Structured concurrency allows for automatic optimization

Because futurize() limits the life-span of the parallel tasks, it can:

- cancel remaining parallel tasks
  - if there is an error
  - if the user or the operating system requests an interrupt
- estimate efficiency of parallelization
  - is it worth it?
  - suggest a better parallel backend
- optimize distribution of objects to parallel workers
  - by chunking
  - by remote caching
  - via shared memory (e.g. new mori package by C. Gao 2026)
- be agile to resource specifications
  - memory, run-time, GPU, …

Before I wrap up this presentation, I would like to say that the ‘futurize’ package implements a principle known as structured concurrency. That means it manages the lifecycle of each parallel task and, because of that, it can provide quite advanced features automatically. For example, if a worker process encounters an error or gets interrupted, futurize() can immediately cancel the remaining parallel workers and running tasks, which prevents resources from being wasted. It can handle interrupts gracefully.
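
From the caller’s side, an error in one parallel task can be handled like any other R error. The sketch below assumes ‘futurize’ is installed and uses a hypothetical `fit_model()` that fails for one record. It only shows that the error surfaces in the calling session; the cancellation of remaining tasks happens behind the scenes and is not observable in this example.

```r
library(futurize)
library(future)   # attached explicitly so that plan() is available

# Hypothetical model function that fails for one patient
fit_model <- function(p) {
  if (p == 3) stop("bad patient record: ", p)
  p^2
}
patients <- 1:6

plan(multisession, workers = 2)   # assumption: two local workers
res <- tryCatch(
  lapply(patients, fit_model) |> futurize(),
  error = function(e) e
)
plan(sequential)

# The worker's error is caught in the main session
print(conditionMessage(res))
```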

But looking ahead we can also imagine that it estimates the efficiency of a parallelization. It can try to run things sequentially and in parallel on different backends so it can conclude if it’s worth parallelizing the call. It can suggest a better parallel backend than the one you’re using.

It can also optimize the distribution of objects to parallel workers. We already do this with chunking, but you can imagine remote caching, or using shared memory, which some of you might have heard of, e.g. the ‘mori’ package by Charlie Gao that just came out. I also want to point out that Bioconductor has had the ‘SharedObject’ package, which is a little bit more complicated to use, but it has been used for several years in this spirit.
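
Chunking can already be controlled explicitly in the ‘future.apply’ package through the `future.chunk.size` argument of future_lapply(). The sketch below uses that existing argument with hypothetical `xs` and `fcn()`; whether and how futurize() exposes the same control is not assumed here.

```r
library(future.apply)

# Hypothetical inputs
xs <- 1:100
fcn <- function(x) sqrt(x)

plan(multisession, workers = 2)   # assumption: two local workers

# Default scheduling: elements are split into one chunk per worker
ys_default <- future_lapply(xs, fcn)

# Explicit chunking: on average 10 elements per future
ys_chunked <- future_lapply(xs, fcn, future.chunk.size = 10)

plan(sequential)

# Chunking changes how work is distributed, not the result
print(identical(ys_default, ys_chunked))
```

Fewer, larger chunks reduce communication overhead, while smaller chunks give better load balancing when task run times vary.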

It also allows us to implement features that are on the roadmap, where you can specify the resources that you need, like the amount of memory, the amount of run-time, whether you need a GPU, and so on.

So with that, I want to say that even though we’ve achieved a lot thus far, there is a lot more on the roadmap that will improve things, and you do not have to do anything. Your code using these functions will perform better over the next generations of package releases.

Go compute and may the future be with you!

Easy to install:

install.packages(c("futurize", "progressify"))

Easy to use:

ys <- lapply(xs, fcn) |> progressify() |> futurize()

Stay with your favorite coding style:

ys <- xs |> map(fcn) |> progressify() |> futurize()

Available elsewhere too:

ys <- glmnet::cv.glmnet(x, y) |> futurize()

https://www.futureverse.org