Compresses and decompresses files, such as gzip- and bzip2-compressed files.

NOTE: By default (remove=TRUE), the input file is removed once the output file has been fully created and closed.

# S3 method for default
compressFile(filename, destname=sprintf("%s.%s", filename, ext), ext, FUN,
  temporary=FALSE, skip=FALSE, overwrite=FALSE, remove=TRUE, BFR.SIZE=1e+07, ...)
# S3 method for default
decompressFile(filename, destname=gsub(sprintf("[.]%s$", ext), "", filename,
  ignore.case=TRUE), ext, FUN, temporary=FALSE, skip=FALSE, overwrite=FALSE,
  remove=TRUE, BFR.SIZE=1e+07, ...)
# S3 method for default
isCompressedFile(filename, method=c("extension", "content"), ext, fileClass, ...)
# S3 method for default
bzip2(filename, ..., ext="bz2", FUN=bzfile)
# S3 method for default
bunzip2(filename, ..., ext="bz2", FUN=bzfile)
# S3 method for default
gzip(filename, ..., ext="gz", FUN=gzfile)
# S3 method for default
gunzip(filename, ..., ext="gz", FUN=gzfile)

Arguments

filename

Pathname of input file.

destname

Pathname of output file.

temporary

If TRUE, the output file is created in a temporary directory.

skip

If TRUE and the output file already exists, the output file is returned as is.

overwrite

If TRUE and the output file already exists, the file is silently overwritten. Otherwise, an exception is thrown (unless skip is TRUE).

remove

If TRUE, the input file is removed after the output file has been written; if FALSE, it is kept.

BFR.SIZE

The number of bytes read in each chunk.

...

Passed to the underlying function, or not used.

method

A character string specifying how to infer whether a file is compressed or not.

ext, fileClass, FUN

(internal) Filename extension, file class, and a connection function used to read from/write to file.
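
The interaction between remove, skip and overwrite can be easier to see in a short session. The following is a minimal sketch, assuming the R.utils package is installed and using a scratch file in tempdir() (the file name args-demo.txt is made up for illustration):

```r
library(R.utils)

# Work in a scratch directory so the working directory is left untouched
path <- file.path(tempdir(), "args-demo.txt")
cat(file = path, "Hello world!")

# remove=FALSE keeps the input file next to the compressed output
gz <- gzip(path, remove = FALSE)
print(file.exists(path))
print(file.exists(gz))

# The output now exists: skip=TRUE returns it as is
gz_skipped <- gzip(path, remove = FALSE, skip = TRUE)

# overwrite=TRUE replaces the existing output instead of throwing
gz_new <- gzip(path, remove = FALSE, overwrite = TRUE)

# With neither skip nor overwrite, an existing output is an error
res <- tryCatch(gzip(path, remove = FALSE), error = function(e) "error")
print(res)

# Inspect whatever attributes the returned pathname carries
print(attributes(gz_new))

# Cleanup
file.remove(path, gz)
```

Passing remove=FALSE explicitly is a reasonable habit while experimenting, since the default deletes the input file.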

Value

Returns the pathname of the output file. The number of bytes processed is returned as an attribute.

isCompressedFile(), isGzipped() and isBzipped() return a logical value. Note that with method = "extension" (default), only the filename extension is used to infer whether the file is compressed or not. Specifically, it does not matter whether the file actually exists or not.
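
To illustrate the difference between the two methods, here is a small sketch that writes plain text to a file whose name merely ends in .gz. It assumes R.utils is installed and that isGzipped() forwards the method argument to isCompressedFile(); the file name not-really.txt.gz is made up for illustration:

```r
library(R.utils)

# A plain-text file that only carries a ".gz" extension
fake <- file.path(tempdir(), "not-really.txt.gz")
cat(file = fake, "Hello world!")

# Extension-based check (the default): looks at the name only
by_ext <- isGzipped(fake)

# Content-based check: inspects the file itself
by_content <- isGzipped(fake, method = "content")

print(c(extension = by_ext, content = by_content))

# Cleanup
file.remove(fake)
```

When file names come from an untrusted source, the content-based check is the more cautious choice, at the cost of having to open the file.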

Details

Internally, bzfile() and gzfile() (see connections) are used to read and write files. If the process is interrupted before it completes, the partially written output file is automatically removed.
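
The general idea of chunked copying through a compressing connection, with cleanup of a partial output file, can be sketched in base R. This is not the package's implementation; compress_sketch() and its chunk argument are hypothetical names used only for illustration:

```r
# Hypothetical helper, not part of the package: copy a file through gzfile()
# in chunks and remove the output if the copy does not finish.
compress_sketch <- function(src, dest = paste0(src, ".gz"), chunk = 1e7) {
  inn <- file(src, open = "rb")
  on.exit(close(inn), add = TRUE)

  out <- gzfile(dest, open = "wb")
  ok <- FALSE
  on.exit({
    close(out)
    # Drop the partially written output if an error or interrupt occurred
    if (!ok) file.remove(dest)
  }, add = TRUE)

  nbytes <- 0
  repeat {
    # Read at most 'chunk' bytes; a zero-length result means end of file
    bfr <- readBin(inn, what = raw(), n = chunk)
    if (length(bfr) == 0L) break
    writeBin(bfr, con = out)
    nbytes <- nbytes + length(bfr)
  }

  ok <- TRUE
  nbytes
}

# Usage: compress a scratch file and read it back
src <- tempfile(fileext = ".txt")
cat(file = src, "Hello world!")
n <- compress_sketch(src)
roundtrip <- readLines(paste0(src, ".gz"), warn = FALSE)
print(n)
print(roundtrip)
```

Reading in fixed-size chunks keeps memory use bounded regardless of file size, which is presumably the purpose of the BFR.SIZE argument.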

Author

Henrik Bengtsson

Examples

  ## bzip2
  cat(file="foo.txt", "Hello world!")
  print(isBzipped("foo.txt"))
#> [1] FALSE
  print(isBzipped("foo.txt.bz2"))
#> [1] TRUE

  bzip2("foo.txt")
  print(file.info("foo.txt.bz2"))
#>             size isdir mode               mtime               ctime
#> foo.txt.bz2   51 FALSE  664 2023-11-17 17:03:35 2023-11-17 17:03:35
#>                           atime  uid  gid  uname grname
#> foo.txt.bz2 2023-11-17 17:03:35 1000 1000 henrik henrik
  print(isBzipped("foo.txt"))
#> [1] FALSE
  print(isBzipped("foo.txt.bz2"))
#> [1] TRUE

  bunzip2("foo.txt.bz2")
  print(file.info("foo.txt"))
#>         size isdir mode               mtime               ctime
#> foo.txt   12 FALSE  664 2023-11-17 17:03:35 2023-11-17 17:03:35
#>                       atime  uid  gid  uname grname
#> foo.txt 2023-11-17 17:03:35 1000 1000 henrik henrik

  ## gzip
  cat(file="foo.txt", "Hello world!")
  print(isGzipped("foo.txt"))
#> [1] FALSE
  print(isGzipped("foo.txt.gz"))
#> [1] TRUE

  gzip("foo.txt")
  print(file.info("foo.txt.gz"))
#>            size isdir mode               mtime               ctime
#> foo.txt.gz   32 FALSE  664 2023-11-17 17:03:35 2023-11-17 17:03:35
#>                          atime  uid  gid  uname grname
#> foo.txt.gz 2023-11-17 17:03:35 1000 1000 henrik henrik
  print(isGzipped("foo.txt"))
#> [1] FALSE
  print(isGzipped("foo.txt.gz"))
#> [1] TRUE

  gunzip("foo.txt.gz")
  print(file.info("foo.txt"))
#>         size isdir mode               mtime               ctime
#> foo.txt   12 FALSE  664 2023-11-17 17:03:35 2023-11-17 17:03:35
#>                       atime  uid  gid  uname grname
#> foo.txt 2023-11-17 17:03:35 1000 1000 henrik henrik

  ## Cleanup
  file.remove("foo.txt")
#> [1] TRUE