| Title: | Miscellaneous Utilities Developed by the Predictive Ecology Group |
|---|---|
| Description: | Miscellaneous utilities developed by the Predictive Ecology Group (<http://predictiveecology.org>). |
| Authors: | Eliot J B McIntire [aut, cre] (ORCID: <https://orcid.org/0000-0002-6914-8316>), Alex M Chubaty [aut] (ORCID: <https://orcid.org/0000-0001-7146-8135>), His Majesty the King in Right of Canada, as represented by the Minister of Natural Resources Canada [cph] |
| Maintainer: | Eliot J B McIntire <[email protected]> |
| License: | GPL-3 |
| Version: | 0.0.4.9016 |
| Built: | 2026-06-03 22:17:58 UTC |
| Source: | https://github.com/PredictiveEcology/pemisc |
pemisc package
Miscellaneous utilities developed by the Predictive Ecology Group (https://predictiveecology.org).
This reports the 'available' memory from a system call: free on Linux,
vm_stat on macOS, or wmic on Windows.
If none of these is installed on the system, returns NULL.
    availableMemory()
Numeric of class "object_size", so it can be reported in any units with format,
e.g., format(availableMemory(), unit = "GB").
See man free for a description of how available memory is estimated.
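As a rough illustration of the unit handling, the sketch below builds an "object_size" value by hand and formats it with base R. The 8e9 bytes is a made-up figure standing in for a real reading from availableMemory(); nothing here calls the package itself.

```r
# Hypothetical value (8e9 bytes) standing in for the result of availableMemory()
mem <- structure(8e9, class = "object_size")

# format() dispatches to the object_size method; `units` selects the unit
memGB <- format(mem, units = "GB")
memAuto <- format(mem, units = "auto")

print(memGB)
print(memAuto)
```

Because the class carries the size in bytes, the same value can be reported in whichever unit suits the message being printed.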
Calculates biomass based on DBH, or on DBH and height.
    biomassCalculation(species, DBH, includeHeight, height, equationSource)

    ## S4 method for signature 'character,numeric,logical,numeric,character'
    biomassCalculation(species, DBH, includeHeight, height, equationSource)

    ## S4 method for signature 'character,numeric,missing,numeric,character'
    biomassCalculation(species, DBH, height, equationSource)

    ## S4 method for signature 'character,numeric,logical,numeric,missing'
    biomassCalculation(species, DBH, includeHeight, height)

    ## S4 method for signature 'character,numeric,missing,numeric,missing'
    biomassCalculation(species, DBH, height)

    ## S4 method for signature 'character,numeric,missing,missing,character'
    biomassCalculation(species, DBH, equationSource)

    ## S4 method for signature 'character,numeric,missing,missing,missing'
    biomassCalculation(species, DBH)
species |
Character string giving the species name. |
DBH |
Numeric. The tree's diameter at breast height (DBH, cm). |
includeHeight |
Logical. Whether the biomass is calculated based on DBH and height.
If |
height |
Numeric. The tree's height (m). |
equationSource |
Character. Determine the sources of equations.
Currently, this function has two options, i.e., |
Biomass (kg) and a missedSpecies list of the species for which biomass was not calculated.
Yong Luo
    ## Not run:
    DBH <- seq(1, 100, 5)
    species <- c(rep("jack pine", 10), rep("black spruce", 10))
    species[1] <- "wrongSpecies"
    height <- seq(20, 40, length = 20)

    # without height information, taking the equations from Lambert 2005
    biomass1 <- biomassCalculation(species = species, DBH = DBH)

    # with height information, taking the equations from Lambert 2005
    biomass2 <- biomassCalculation(species = species, DBH = DBH,
                                   includeHeight = TRUE, height = height)
    ## End(Not run)
.prj file
In cases where a shapefile is missing its associated .prj file.
    createPrjFile(
      shpFile,
      urlForProj = "http://spatialreference.org/ref/epsg/nad83-utm-zone-11n/prj/"
    )
shpFile |
The filename of a shapefile to add |
urlForProj |
The url from which to fetch the projection, e.g.,
|
raster::factorValues()
Note that there is an option to remove the NAs (na.rm), which will make it MUCH faster
if TRUE.
    factorValues2(x, v, layer, att, append.names, na.rm = FALSE)
x |
Raster* object |
v |
integer cell values |
layer |
integer > 0 indicating which layer to use (in a RasterStack or RasterBrick) |
att |
numeric or character. Which variable(s) in the RAT table should be used. If |
append.names |
logical. Should the names of the returned data.frame be a combination of the name of the layer and the RAT variables? (Can be useful for multilayer objects.) |
na.rm |
Logical. If |
Search among local objects (which will often be arguments passed into a function) as well as
dot objects to match the formals needed by fn.
If localFormalArgs is named, then it will match the formal
(name of localFormalArgs) with the local object,
e.g., localFormalArgs = c(x = "obj") will find the object in the local environment called
"obj", and this will be found because it matches the x argument in fn.
    getLocalArgsFor(fn, localFormalArgs, envir, dots)
fn |
Function name(?) |
localFormalArgs |
A (named) character vector or arguments to |
envir |
The environment in which to (???) |
dots |
TODO: need description |
List of named objects. The names are the formals in fn, and
the objects are the values for those formals.
This can easily be passed to do.call(fn, args1)
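The named-vector matching described above can be sketched in base R as follows. This is only an illustration under assumptions, not the package's implementation: fn, obj, and n are hypothetical names, and the lookup uses mget() directly rather than getLocalArgsFor.

```r
# Hypothetical function and local objects, for illustration only
fn <- function(x, n) head(x, n)
obj <- letters
n <- 3

# Names are the formals of fn; values are the names of local objects
localFormalArgs <- c(x = "obj", n = "n")

# Fetch the local objects, then rename them to the formals they should fill
args1 <- mget(unname(localFormalArgs), envir = environment())
names(args1) <- names(localFormalArgs)

res <- do.call(fn, args1)
print(res)
```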
Get or update packages
    getOrUpdatePkg(p, minVer = "0")
p |
character string denoting a package name |
minVer |
character string denoting the minimum package version |
invoked for side effect of installing packages
Get the package name from a GitHub repo/package@branch string
    ghPkgName(x)
x |
character vector of package names |
a named character vector
    pkgs <- c("dplyr", "PredictiveEcology/pemisc", "PredictiveEcology/SpaDES.core@development")
    ghPkgName(pkgs) ## "dplyr" "pemisc" "SpaDES.core"
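For readers curious how such a string can be taken apart, here is a minimal base R sketch. pkgNameSketch is a hypothetical helper, not the package's own code, and naming the result by the input strings is an assumption, since the documentation only says the result is named.

```r
pkgs <- c("dplyr", "PredictiveEcology/pemisc",
          "PredictiveEcology/SpaDES.core@development")

# Hypothetical helper: drop the "repo/" prefix, then drop any "@branch" suffix
pkgNameSketch <- function(x) {
  out <- sub("@.*$", "", basename(x))
  names(out) <- x  # assumption: name each result by its input string
  out
}

pkgNames <- pkgNameSketch(pkgs)
print(unname(pkgNames))
```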
When running arbitrary functions inside other functions, there is a common
construct in R to use .... It does not work, however, in the general case
to write do.call(fn, list(...)) because not all fn themselves
accept .... So this will fail if too many arguments are supplied to
the .... In the general case, we want to write:
do.call(fn, list(onlyTheArgumentsThatAreNeeded)). This function helps
to find the onlyTheArgumentsThatAreNeeded by determining a) what is needed
by the fn (which can be a list of many fn), and b) where to find
values, either in an arbitrary environment or passed in via dots.
    identifyVectorArgs(fn, localFormalArgs, envir, dots)
fn |
A function or list of functions from which to run |
localFormalArgs |
A vector of possible objects, e.g., from |
envir |
The environment to find the objects named in |
dots |
Generally list(...), which would be an alternative place to find
|
A list of length 2, named argsSingle and argsMulti, which
can be passed to e.g.,
MapOrDoCall(fn, multiple = args1$argsMulti, single = args1$argsSingle)
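The do.call problem described above can be reproduced in a few lines of base R. The sketch below uses a hypothetical fn and argument list, and filters by formals() by hand; it illustrates the idea only and is not how identifyVectorArgs is implemented.

```r
# Hypothetical function that does not accept `...`
fn <- function(x, y) x + y
supplied <- list(x = 1, y = 2, z = 3)  # z is not a formal of fn

# Passing everything fails with an "unused argument" error
res <- tryCatch(do.call(fn, supplied),
                error = function(e) conditionMessage(e))

# Keep only the arguments that fn can accept
needed <- supplied[names(supplied) %in% names(formals(fn))]
out <- do.call(fn, needed)

print(res)
print(out)
```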
Determines whether a string that may correspond to a package name
(e.g., repo/package@branch), could be a package installed from GitHub.
This is determined solely by the presence of a / in the string.
See example below.
    isGitHubPkg(x)
x |
character vector of package names |
a named logical vector
    pkgs <- c("dplyr", "PredictiveEcology/pemisc", "PredictiveEcology/SpaDES.core@development")
    isGitHubPkg(pkgs) ## FALSE TRUE TRUE
makeForkCluster with random seed set
This will set different random seeds on the clusters (not the default)
with makeForkCluster.
It also defaults to creating a logfile, with a message stating where it is.
    makeClusterRandom(
      ...,
      type = "SOCK",
      iseed = NULL,
      libraries = NULL,
      objects = NULL,
      envir = parent.frame()
    )

    makeForkClusterRandom(..., iseed = NULL)

    makeSockClusterRandom(..., iseed = NULL)
... |
passed to |
type |
One of the supported types: see ‘Details’. For
|
iseed |
passed to |
libraries |
A character vector of libraries to load in the SOCK cluster. This is ignored if a "FORK" cluster |
objects |
a character string of objects that are required inside the SOCK cluster. Ignored if type != "SOCK" |
envir |
Required if |
makeIpsForNetworkCluster is a simple wrapper around makeIps.
    makeIpsForNetworkCluster(
      ipStart = "10.20.0",
      ipEnd = c(68, 97, 189, 213, 220, 58, 106, 184, 217),
      availableCores = c(50, 50, 50, 50, 50, 50, 23, 23, 23),
      availableRAM = c(950, 500, 500, 500, 500, 500, 245, 245, 245),
      nProcess = 8,
      proc = "cores",
      internalProcesses = 10,
      sizeGbEachProcess = 35,
      localHostEndIp = 68
    )

    makeIps(machines, ipStart, proc, nProcess, sizeGbEachProcess)
ipStart |
Network address prefix (i.e., the first, second, and third triplets of the IP address) |
ipEnd |
Host IP address identifier (i.e., the final triplet of the IP address) |
availableCores |
the number of available threads on each machine. |
availableRAM |
the available RAM on each machine in GB |
nProcess |
the number of processes |
proc |
one of |
internalProcesses |
DESCRIPTION NEEDED |
sizeGbEachProcess |
the size in GB of each process |
localHostEndIp |
the address in |
machines |
|
A vector of IP addresses associated with each machine in the network cluster.
Given the size of a problem, it may not be useful to create a cluster. This will make a fork cluster (so Linux only).
    makeOptimalCluster(
      useParallel = getOption("pemisc.useParallel", FALSE),
      MBper = 500,
      maxNumClusters = parallelly::availableCores(constraints = "connections"),
      assumeHyperThreads = FALSE,
      ...
    )
useParallel |
Logical or numeric. If |
MBper |
Numeric. Passed to |
maxNumClusters |
Numeric or Integer. The theoretical upper limit for number of nodes to use with the cluster. |
assumeHyperThreads |
Logical. If |
... |
Passed to |
Map and parallel::clusterMap together
This will send to Map or clusterMap, depending on whether cl is provided.
Because they use different argument names for the main function
to call, leave that argument unnamed.
    Map2(f, ..., cl = NULL)
f |
passed as |
... |
passed to |
cl |
A cluster object, passed to |
    ## Not run:
    a <- 1:5
    Map2(a, f = function(x) x)
    ## End(Not run)
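The dispatch described above can be sketched roughly as follows. map2Sketch is a hypothetical stand-in written for illustration, not the package's Map2; it only shows why the function argument is passed positionally to either backend.

```r
# Hypothetical stand-in for Map2: Map() names its function `f`,
# clusterMap() names it `fun`, so the function is passed positionally
map2Sketch <- function(f, ..., cl = NULL) {
  if (is.null(cl)) {
    Map(f, ...)
  } else {
    parallel::clusterMap(cl, f, ...)
  }
}

# Without a cluster, this falls through to Map()
out <- map2Sketch(1:3, 4:6, f = function(x, y) x + y)
print(unlist(out))
```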
Map/lapply all in one
Usually run after identifyVectorArgs which will separate the arguments
into vectors of values for a call to Map, and arguments that have
only one value (passed to MoreArgs in Map). If all are single
length arguments, then it will pass to lapply. If a cl is provided
and is non-NULL, then it will pass all arguments to clusterMap or
clusterApply.
    MapOrDoCall(fn, multiple, single, useCache = FALSE, cl = NULL)
fn |
The function that will be run via |
multiple |
This is a list of the arguments that Map will cycle over. |
single |
Passed to |
useCache |
Logical indicating whether to use the cache. |
cl |
A cluster object or |
identifyVectorArgs
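The split between cycled and fixed arguments can be illustrated with base R alone. In the sketch below, the length-based split is a simplifying assumption about what identifyVectorArgs produces, and fn and allArgs are hypothetical.

```r
# Hypothetical function and arguments
fn <- function(x, power) x^power
allArgs <- list(x = 1:3, power = 2)

# Assumed rule for this sketch: arguments longer than 1 are cycled over
isMulti <- lengths(allArgs) > 1
multiple <- allArgs[isMulti]
single <- allArgs[!isMulti]

# Cycle over `multiple`, holding `single` fixed via MoreArgs
out <- do.call(Map, c(list(f = fn), multiple, list(MoreArgs = single)))
print(unlist(out))
```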
Sends to message, but in a structured way so that a data.frame-like object can
be cleanly sent to messaging.
    messageDF(df, round, colour = NULL)
df |
A data.frame, data.table, matrix |
round |
An optional numeric to pass to |
colour |
An optional colour to use from |
RasterStack
Rescales the values of each RasterLayer between [0,1].
    normalizeStack(x)
x |
A |
Tati Micheletti
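A rescaling to [0,1] is commonly done with min-max normalization, (v - min) / (max - min). The sketch below applies that to plain numeric vectors standing in for layers; it assumes this formula and does not use the raster package or normalizeStack itself.

```r
# Min-max rescaling of one vector of values, ignoring NAs when finding the range
rescale01 <- function(v) {
  rng <- range(v, na.rm = TRUE)
  (v - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical "layers" represented as plain numeric vectors
layers <- list(a = c(2, 4, 6, NA), b = c(10, 20, 15, 30))
scaled <- lapply(layers, rescale01)
print(scaled)
```

Note that a layer with a single constant value would divide by zero under this formula, so a real implementation would need to handle that case.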
This uses ps -ef so only works on unix-alikes. It will search
for the percent CPU use and select only those above minCPU.
    numActiveThreads(pattern = "--slave", minCPU = 50)
pattern |
Character string that will be matched to the |
minCPU |
A numeric indicating what percent is the minimum to be considered "active" |
A numeric of the number of active threads that match the pattern
Eliot McIntire
    ## Not run:
    ## Determine how many threads are used in each remote machine in a cluster
    library(parallel) # for clusterExport, clusterEvalQ, stopCluster
    cores <- "localhost" # put other machine names here
    uniqueCores <- unique(cores)
    cl <- future::makeClusterPSOCK(uniqueCores, revtunnel = TRUE)
    clusterExport(cl, "numActiveThreads")
    out <- clusterEvalQ(cl, {
      numActiveThreads()
    })
    names(out) <- uniqueCores
    unlist(out)
    stopCluster(cl)
    ## End(Not run)
Optimally determine the number of cores to use to set up a new cluster, based on:
the number of cores available (see note);
the amount of free memory available on the local machine;
the number of cores requested vs. the number available, such that if requesting more cores than available, the number of cores used will be adjusted to be a multiple of the number of cores needed, so jobs can be run in approximately-even-sized batches. (E.g., if 16 cores available but need 50, the time taken to run 3 batches of 16 plus a single batch of 2 – i.e., 4 batches total – is the same as running 4 batches of 13.)
    optimalClusterNumGeneralized(
      memRequiredMB = 500,
      maxNumClusters = parallelly::availableCores(constraints = "connections"),
      NumCoresAvailable = parallelly::availableCores(constraints = "connections"),
      availMem = pemisc::availableMemory()/1e+06
    )

    optimalClusterNum(
      memRequiredMB = 500,
      maxNumClusters = parallelly::availableCores(constraints = "connections")
    )
memRequiredMB |
The amount of memory needed in MB |
maxNumClusters |
The number of nodes needed (requested) |
NumCoresAvailable |
The number of cores available on the local machine (see note). |
availMem |
The amount of free memory (RAM) available to use. |
integer specifying the number of cores
R hardcodes the maximum number of socket connections it can use (currently set to 128 in R 4.1). Three of these are reserved for the main R process, so practically speaking, a user can create at most 125 connections, e.g., when creating a cluster. See https://github.com/HenrikBengtsson/Wishlist-for-R/issues/28.
We limit this a bit further here just in case the user already has open connections.
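The even-batch adjustment from the description (16 cores available, 50 needed) can be worked through with a small helper. evenBatchCores is hypothetical and covers only the batching step; it ignores the memory and connection limits that the package functions also take into account.

```r
# Hypothetical helper: pick a core count that gives roughly even-sized batches
evenBatchCores <- function(nNeeded, nAvailable) {
  if (nNeeded <= nAvailable) return(nNeeded)
  nBatches <- ceiling(nNeeded / nAvailable)  # minimum number of batches
  ceiling(nNeeded / nBatches)                # cores per batch, evened out
}

coresUsed <- evenBatchCores(50, 16)  # 4 batches are needed either way
print(coresUsed)
```

With 50 jobs and 16 cores, four batches are unavoidable, so using 13 cores per batch takes about as long as using all 16 while leaving three cores free.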
Uses igraph and Require::pkgDep.
    pkgDepsGraph(
      pkgs = c("LandR", "pemisc", "map", "SpaDES", "SpaDES.tools", "SpaDES.core",
               "SpaDES.addins", "SpaDES.shiny", "reproducible", "quickPlot"),
      plot.it = TRUE
    )
pkgs |
A character vector of package names. Default is
|
plot.it |
Logical. If |
A list of 2: dt a data.table of the dependencies, and dtGraph
an igraph object that can be plotted with plot()
Downloads data from CWFIS Datamart at http://cwfis.cfs.nrcan.gc.ca/datamart.
This runs prepInputs internally, so the user can pass studyArea, etc.
    prepFireCanada(
      year,
      type = c("NBAC", "Polygon", "Point"),
      urlBase = "http://cwfis.cfs.nrcan.gc.ca/downloads/nbac/",
      ...
    )
year |
Numeric, length 1. Which year, from 1986 to 2018 (currently) to download |
type |
Either "NBAC", "Polygon" or "Point" to get the National Burn Area Composite, the Polygon or the Point datasets. |
urlBase |
The url of the directory where the NBAC are stored. Default is the currently known url. If this url becomes stale, please notify the predictive ecology team. |
... |
Additional arguments. |
A SpatialPolygonsDataFrame plus several downloaded files, including
the ‘.zip’ archive and the extracted files.
Because it is running prepInputs, checksumming is occurring too.
    ## Not run:
    library(sf)
    # This will download 2 recent years
    NBAC <- lapply(2016:2017, function(yr) prepFireCanada(yr))
    # `type` must be one of "NBAC", "Polygon" or "Point"
    Points <- prepFireCanada(2017, type = "Point", fun = "st_read")
    Polygons <- prepFireCanada(2017, type = "Polygon")
    ## End(Not run)
This extracts or creates a new raster layer, intended to be used as
the rasterToMatch argument in further prepInputs calls.
    rasterToMatch(x, ...)

    ## S4 method for signature 'Raster'
    rasterToMatch(x, studyArea, ...)

    ## S4 method for signature 'SpatialPolygonsDataFrame'
    rasterToMatch(x, studyArea, rasterToMatch, ...)
x |
A Raster Layer with correct resolution and origin. |
... |
Additional arguments |
studyArea |
A |
rasterToMatch |
The raster to match in a |
A RasterLayer object.
Deprecated functionality
    reproducibilityReceipt(title)
title |
Header title for the inserted details section. |
Similar to terms, but this is used on a quoted model and
will only return unique matches in data.
    termsInData(model, data)
model |
A quoted model statement |
data |
A data.frame-like object with column names in which to match terms in
|
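The idea of matching model terms against column names can be approximated in base R. The sketch below uses all.vars() and intersect() on a hypothetical quoted model and data.frame; it is an assumption about the general approach and not the implementation of termsInData.

```r
# Hypothetical quoted model and data
model <- quote(y ~ age + height + age:height)
dat <- data.frame(age = 1:3, height = c(1.5, 1.6, 1.7), other = 4:6)

# Unique variable names used in the model
vars <- all.vars(model)

# Keep only those that are also columns of the data
matched <- intersect(vars, colnames(dat))
print(matched)
```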