Replace one of the parent environments with another

replace_env(envir, search, replace, update_parent = TRUE)

Arguments

envir

An environment or an object with an environment, e.g. a function or a formula.

search

A environment, among the parents of envir, to be replaced. It is possible to specify a list of alternative environments.

replace

A environment.

update_parent

If TRUE, or 1L, the parent environment of replace is set to the parent environment of the replaced "search" environment. If FALSE, or 0L, it is not updated. If a positive integer greater than one, then that parent generation is updated, e.g. update_parent = 2L will update the parent environment of the parent of replace.

Value

Invisibly, the replaced environment.

Serialization of functions

Consider below function f() where pi is part of environment(f), which is a local environment, and a is a global variable part of parent_env(f).

cargo <- rnorm(1e6)
a <- 2
f <- local({
  pi <- 3.14
  function() {
    n <- 4
    a * pi / n
  }
})

We can visualize this as:

+----------------------+
| parent_env(f):       |
| cargo = { 1e6 }      |
| a = 2                |
| f                    |
+----------------------+
           ^
           |
+----------------------+
| environment(f):      |
| pi = 3.14            |
+----------------------+
           ^
           |
+======================+
| f():                 | (frame at runtime)
| n = 4                |
+======================+

In order to evaluate f(), variables a and pi, which are global ("free") variables defined outside and not at runtime in the call frame, like n is. To clarify further what the difference is: we cannot query n from f, but we can query both pi and a as environment(f)$pi and parent_env(f)$a. Similarly, we can also do parent_env(f)$cargo, but it is a variable useless for evaluating f().

When we serialize f (e.g. export it to a parallel worker), the body and the formals of the function is included, as well as all the environments of f up to where f itself lives, e.g. environment(f) and parent_env(f) in our example. However, if the environment where f lives is the global environment (= globalenv()), then it is not part of the serialization output. Imagine we save f to file, restart R, and load it back, e.g.

saveRDS(f, "f.rds")
quit(save = "no")
f <- readRDS(f)

In this case, we will lose a and cargo, which is good and bad. It's bad, because we need to bring a back, in order to evaluate f(), e.g.

f()
#> Error in f() : object 'a' not found

It's good, because we don't have to pay the price of serializing the large cargo object. Continuing, our unserialized f() looks like:

+----------------------+
| parent_env(f):       | (= globalenv())
| f                    |
+----------------------+
           ^
           |
+----------------------+
| environment(f):      |
| pi = 3.14            |
+----------------------+
           ^
           |
+======================+
| f():                 | (frame at runtime)
| n = 4                |
+======================+

One way to revive a is to inject a new grandparent environment that holds a copy of a, e.g.

new <- new.env(parent = parent_env(f))
new$a <- 2
parent.env(environment(f)) <- new

+-----------------------+
| parent_env(f, n = 2): | (= globalenv())
| f                     |
+-----------------------+
           ^
           |
+-----------------------+
| parent_env(f):        | (injected environment)
| a = 2                 |
+-----------------------+
           ^
           |
+-----------------------+
| environment(f):       |
| pi = 3.14             |
+-----------------------+
           ^
           |
+=======================+
| f():                  | (frame at runtime)
| n = 4                 |
+=======================+

and we can evaluate f() again;

f()
#> 1.57

We can of course build up the above version of f() before serializing, e.g before we save to file above. Then it is ready to use when unserialized, e.g. read back from file. An alternative way to achieve this is to use the replace_env() function;

new <- as.environment(list(a = a))
replace_env(f, search = find_object(value = f)$envir, replace = new)

If we save this to file, restart R, and load it back in, we'll see that we have a fully functional version of f, e.g. f() gives 1.57.

Another, less verbose, alternative is to use prune_fcn();

globals <- list(a = a)
prune_fcn(f, globals = globals)

And we can even identify global variables automatically:

prune_fcn(f, globals = get_globals(f))

which is the same as the default:

Examples

a <- 42
f <- local(function() a)

f_envs <- parent_envs(f, until = environment(), extra = 1L)
names(f_envs)
#> [1] "0x55f4131a14d0" "0x55f413237d88" "0x55f41a063730"
y <- f()
y
#> [1] 42

new <- as.environment(list(a = 13, pi = 3.14))
old <- replace_env(f, search = environment(), replace = new)
old
#> <environment: 0x55f413237d88>

f2_envs <- parent_envs(f, until = list(environment(), parent_env()))
names(f2_envs)
#> [1] "0x55f4131a14d0" "0x55f412d6e6f8" "0x55f41a063730"

## Note that f() will now see a = 13 in the replaced environment
## rather than a = 42 in the calling environment
z <- f()
z
#> [1] 13

## Undo changes
old2 <- replace_env(f, search = new, replace = old)
stopifnot(identical(old2, new))

f3_envs <- parent_envs(f, until = environment(), extra = 1L)
stopifnot(identical(f3_envs, f_envs))

## f() will now see a = 42 again
z <- f()
z
#> [1] 42

## ------------------------------------------------------------------------
## Example how to avoid huge local objects being part of a local function,
## which might be costly if the function is serialized, e.g. exported
## to a parallel workers
## ------------------------------------------------------------------------

## Call a function with the option to replace the function
## environment with a smaller temporary environment
do_call <- function(fcn, args = list(), envir = parent.frame(),
                    prune = FALSE) {
  if (prune) {
    fcn_where <- find_object(value = fcn, from = envir, which = "last")
    fcn_globals <- get_globals(fcn)
    new <- as.environment(fcn_globals)
    old <- replace_env(fcn, search = fcn_where$envir, replace = new)
    on.exit(replace_env(fcn, search = new, replace = old))
  }
  
  message(sprintf("Size of '%s': %s bytes",
          as.character(substitute(fcn)), size_of(fcn)))
 
  do.call(fcn, args = args, envir = envir)
}

my_fcn <- function(prune = FALSE) {
  cargo <- rnorm(1e6)
  
  n <- 2
  g <- local({
    pi <- 3.14
    function() n * pi
  })
  
  do_call(g, prune = prune)
}

my_fcn()
#> Size of 'g': 8056154 bytes
#> [1] 6.28
my_fcn(prune = TRUE)
#> Size of 'g': 55980 bytes
#> [1] 6.28

n <- 2  
g <- local({
  pi <- 3.14
  function() n * pi
})

my_fcn <- function(prune = FALSE) {
  cargo <- rnorm(1e6)
  do_call(g, prune = prune)
}

my_fcn()
#> Size of 'g': 55207 bytes
#> [1] 6.28
my_fcn(prune = TRUE)
#> Size of 'g': 51043 bytes
#> [1] 6.28


cargo <- rnorm(1e6)
n <- 2  
g <- local({
  pi <- 3.14
  function() n * pi
})

my_fcn <- function(prune = FALSE) {
  do_call(g, prune = prune)
}

my_fcn()
#> Size of 'g': 8055064 bytes
#> [1] 6.28
my_fcn(prune = TRUE)
#> Size of 'g': 51043 bytes
#> [1] 6.28

rm(list = c("cargo", "n"))


## WARNING: Large objects inside local environments of
##          the function will not the pruned!
g <- local({
  cargo <- rnorm(1e6)
  n <- 2  
  local({
    pi <- 3.14
    function() n * pi
  })
})

my_fcn <- function(prune = FALSE) {
  do_call(g, prune = prune)
}

my_fcn()
#> Size of 'g': 8055204 bytes
#> [1] 6.28
my_fcn(prune = TRUE)
#> Size of 'g': 8051236 bytes
#> [1] 6.28